Last Year's Model - Published in www.businessintelligence.com October 2003

I recently heard a conversation, perhaps tongue-in-cheek, where this question was raised: When should a system be considered a legacy system? The answer: right after it is launched into production. Of course, we wouldn't seriously consider our newest system to be a legacy one, but the implication in this joke is that as soon as we have committed to the structure and implementation of a system, it becomes incredibly difficult to modify that system should the need arise. As environments mature, there is a growing hesitancy to change anything without verifying that the change will not jeopardize the systems already in place. And any system or application currently in production at your location is more likely to be of an older vintage than to be recently deployed.

Yet each application that does make it into production was probably designed and programmed to solve a specific business problem, and the concomitant implementation decisions reflect the needs relevant at the time the business problem was being addressed. This means that the underlying data models were probably designed only to support the specific implementation of that solution, and given how difficult it is to change things once they are part of business operations, those data models are probably locked in the time of their implementation. We have statically bound what we might call temporal or situational meta assumptions into the application, even though the outside world that is supposed to be reflected in the data model is constantly changing. The implication is that today, when attempting to aggregate and rearrange information coming from different systems across an enterprise, we need to deal with extracting data from models that have implicit business rules relating to their original design and implementation.
As an example, think about this: Twenty years ago, most residential customers lived in a residence that had a single, physically attached telephone line. Therefore, it would not have been unusual to assume that each individual was connected to a single telephone number, and that the telephone number was bound to the location. Today, many locations have multiple physical telephone lines connected, as opposed to the earlier convention of one telephone number per residence. Alternatively, many individuals, in addition to their home land lines, carry mobile telephones as well as pagers, and in this case there is an associated telephone number that is not bound to a specific location. Lastly, there are some people who have given up their land lines, preferring to use a mobile phone as their only telephone number.

Each of these cases demonstrates a significant change in the perceptions associated with the relationships between an individual, a residence location, and the means of contacting that person by telephone. If you look at data models designed even as late as a few years ago, you'll find that data tables designed to represent individuals typically incorporate name, address, and a telephone number all within the same table. Yet, just a few years later, the assumptions embedded within the original design no longer fit the changed world that the design is meant to model.

There are other examples of this phenomenon, such as the assumption that every location has a mailing address. In this case, data tables designed to hold location information incorporate the standard address line 1, address line 2, city, state, and ZIP representation. Yet, in some cases one might want to represent locations that are not likely to be receiving mail, such as:
In each of these cases, the traditional modeling approach for addresses is insufficient, although a scan of the actual information that is sometimes shoe-horned into these tables shows that information clients occasionally try to adapt existing models for these alternate purposes.

Another example involves the difference between accounts and the individuals holding those accounts. A review of any large financial institution's data model may reveal that the company does not distinguish between an account and the customer whose name is on the account. But in reality there may be more than one individual associated with each account, and any individual may be associated with more than one account.

How does this relate to business intelligence? As I mentioned earlier, our goal as BI professionals is to aggregate data from across the enterprise into an analytical repository. The main issues revolve around two aspects of the same problem: determining what meta assumptions were in vogue at the time that the data model was set in stone, and determining how to extract the relevant information (especially in the context of those meta assumptions) into a model that is meaningful in the BI repository. From this you can draw the conclusion that there are additional complications to the data extraction process that you probably did not consider. I believe that the real issue can be abstractly understood as one relating to the semantics of how operational applications make use of their associated databases, and in turn, how those semantics can be revealed, documented, and understood.

So how does the savvy manager proceed? I'll give you readers a month to ponder that question, as I plan to discuss solution approaches in next month's column.
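
For readers who want something concrete to ponder in the meantime, the account example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a recommended design: the names `LegacyAccountRow`, `HoldingModel`, and `load_legacy` are hypothetical, and so is the idea that the legacy system packed joint holders into a single name field separated by " and ".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class LegacyAccountRow:
    """Hypothetical legacy row: the account and its holder share one record."""
    account_id: str
    holder_name: str


@dataclass
class HoldingModel:
    """Hypothetical target model: individuals and accounts linked many-to-many."""
    holdings: set[tuple[str, str]] = field(default_factory=set)  # (person, account_id)

    def add(self, person: str, account_id: str) -> None:
        self.holdings.add((person, account_id))

    def holders_of(self, account_id: str) -> set[str]:
        # Every individual associated with the given account.
        return {p for p, a in self.holdings if a == account_id}

    def accounts_of(self, person: str) -> set[str]:
        # Every account associated with the given individual.
        return {a for p, a in self.holdings if p == person}


def load_legacy(rows: list[LegacyAccountRow]) -> HoldingModel:
    """Unpack the one-holder-per-account assumption during extraction.

    Assumption (illustrative only): joint holders were shoe-horned into
    holder_name as "Name One and Name Two".
    """
    model = HoldingModel()
    for row in rows:
        for name in row.holder_name.split(" and "):
            model.add(name.strip(), row.account_id)
    return model


rows = [
    LegacyAccountRow("A-1", "Pat Lee and Sam Lee"),
    LegacyAccountRow("A-2", "Pat Lee"),
]
model = load_legacy(rows)
```

With these two sample rows, `model.holders_of("A-1")` contains both Pat Lee and Sam Lee, and `model.accounts_of("Pat Lee")` contains both accounts. The point of the sketch is that the extraction step has to know the legacy convention before it can populate a model that separates individuals from accounts.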