Business keys are a central principle of the Datavault methodology. Our entire modelling efforts focus on identifying business concepts around which data can be organised. From an enterprise perspective it should be relatively easy to identify these concepts: Customer, Account, Product, Branch, etc. Equally it should be easy to locate key values within enterprise systems and processes that identify these concepts. Right?
20 years ago, if I was to phone up my insurance company, the first question I’d be asked for is my policy number. If I didn’t have the number to hand it would be a song and dance to search for the appropriate details. When they did find it, you knew that they were writing it down to use in searches in other systems. That policy number was key (pardon the pun) to the way data was organised and stored in the database.
Nowadays, the conversation is slightly different. I don’t need to provide my policy number, the systems are now sophisticated enough to allow my details to be retrieved in a variety of ways, using my name, telephone number or address. In most cases the policy number is simply an attribute (sometimes optional) buried in one of the tables, with the systems using dumb surrogate keys rather than human readable codes. This is simply good design, but it means that our central integration key is lost.
I’m sure that most of us have noticed more and more the prevalence of surrogate keys and the increasing difficulty in identifying reliable business keys. Each system generally tracks its own data with its own set of surrogate keys that are not shared, leading to limited opportunities for passive integration and, therefore, a need for expensive and potentially unreliable active integration.
My question for the DV community is: what strategies do you leverage in order to address the hunt for usable business keys? How do I easily integrate across multiple systems that each use different non-intelligent keys to identify the same concept? If passive integration isn’t achievable then does Datavault become more of a burden than an enabler as we build our data warehouse?