I have checked through previous posts to see if this topic has been covered, but nothing has jumped out. I wanted to get an idea what opinions people have over the inclusion of a Natural Business Key as a mandatory attribute for Hub and Link tables.
What do I mean by a Natural Business Key? Quite simply it is the string that is hashed to create the primary hash key for a hub or link table. This field would be in addition to the business key attributes. Surely, this information is redundant? We already have all the information in the table to be able to derive this string, so why go to the effort and why consume unnecessary space in the database? The answer may not be immediately obvious.
To get to the answer I will walk through a scenario that I believe is very common, where there are differing business key structures for the same business concept. One school of thought is that we build a separate hub for each business key structure. Four different key structures for customer means four separate customer hubs. Hardly transparent and not very integrated.
In the example below shows how two customer hubs might look. CustomerA has a single Customer_Code business key, while CustomerB has a two-part business key consisting of Customer_No and Customer_Suffix.
But we don’t want separate hubs here so why not combine them to something like this?
Yes, this table contains all the same information but there is a problem. For single part business keys the Customer_Code will be populated and the Customer_No and Customer_Suffix will be null. Conversely for a two-part business key Customer_Code will be null. How do we know what was used to generate the hub’s hash key? Did we perform a hash on a single column for the single part key and then a hash on the two-part key, or do we combine all three fields. If that is the case then how do we handle the null values and how can be distinguish them from missing values?
Enter the natural key attribute.
This field tells us exactly how the business key is constructed and what string was hashed to generate the table’s primary key. The table below shows how the data might look:
Key Set Customer_Code Customer_No Customer_Suffix Natural Key ======= ============= =========== =============== =========== ERP AB123 <null> <null> ERP|AB123 CRM <null> AB 123 CRM|AB|123
Notice how the null values have been been excluded from the natural key. Extending the model is also easy. If we encounter a new business key structure the hub table can be extended with the new business key attributes without impacting the existing values, as shown below with the addition of a new numeric Customer_ID business key.
Key Set Customer_Code Customer_No Customer_Suffix Customer_ID Natural Key ======= ============= =========== =============== =========== =========== ERP AB123 <null> <null> <null> ERP|AB123 CRM <null> AB 123 <null> CRM|AB|123 HR <null> <null> <null> 123 HR|123
By the same reasoning link tables should also contain the natural (pre-hashed) key.