I received this question from @joelwittenmyer here on the forums, on another topic. I want to explore this for you: (This is now FREE and PUBLIC knowledge for all to read)
Are the relationships in a link restricted to those found in a single source system table? I.e., Source System Table A (SSA) has a transaction number and a policy number. Source System Table B (SSB) has a policy number and a customer number. Does that mean that we are restricted to building a policy-transaction link, and a customer-transaction link, but we should not populate a link table with a policy key, and a customer key, and put the transaction number in a non-historized satellite because we are creating a relationship that does not exist in it's entirety in a solitary source system table?
The short answer is:
- Yes for Raw Data Vault Links
- NO for Exploration Links (also known as Business Driven Links, also known as Business Data Vault Links)
The raw data vault links are setup to represent an auditable trail of data back to the source system. It really is that "simple" when it comes to brass tacks - this ensures audit ability as well as lineage, traceability, and data integrity. It allows the design to "stick to the standards" without bending or breaking.
That said: MOST people "stop their thinking at that point" - then complain that the DV is too rigid. Unfortunately this is "closed box thinking" and can get you and your team in to trouble. Most people "forget" that there is a notion there called an Exploration Link (taught in CDVP2 Certification courses).
Exploration Link is the concept. Alternate Names for this concept include:
- Business Driven Link
- AI / ML / DL Discovered Link (generated by ML etc...)
- PIT Table
- Bridge Table
- PIT-Bridge Hybrid
- Aggregate FACT table
- Business Driven JOIN table
- Security Access JOIN table
The exploration Link can be, should be your best friend. I really, really encourage everyone reading this to understand this principle reaches deep in to the flexibility that you can achieve with the Data Vault. DON'T GET CAUGHT UP THINKING THE DV IS SO RIGID YOU NEED TO STICK INSIDE THE BOX!! Once you have audit ability (at the raw level), then feel free to PLAY!!
Add the structures you need at the business level to accomplish the business use cases as you see fit.
Now please don't take this as a "free license to run around and create hundreds or thousands of join tables" - that is NOT what I am saying. What I am saying is: take a minimalist approach - ONLY build the join tables / Exploration Links that you do need in accordance with the business use cases.
Remember: ONCE they have out-lived their usefulness, DELETE THEM (remove them permanently) because you still have the raw data at the raw level for audit ability. BUT do, back up the design specs and any documentation you generated to support the object. Store that somewhere like a wiki - where you can search it in the future if you ever need it back.
Once you've built one of these objects, you can play - you can add the columns you need at any level of detail you want. You can even use ML / AI / Deep Learning to generate the outcomes here.
These objects can be extremely valuable in your toolbox to agility, and solving complex business problems -delivering simple answers to business where appropriate.
Hope this Helps,
PS: we teach this in CDVP2 certification courses.