Hello, I'm new to Data Vault.
I read Dan Linstedt's book "Building a Scalable Data Warehouse with Data Vault 2.0", but unfortunately it has no examples with relational source databases (primary keys), and I did not find the answer there. Suppose the source has contractor, product and order tables (see the schema).
I have some questions:
1. What should I do with the primary keys of the source tables when loading a hub? Where should they be stored in the DW (for example, for tracing)?
2. How do I load Order_Lnk if there is no business key in the source? I see several options:
- Look up the business keys by the raw ContractorId and ProductId saved somewhere in the Data Vault (for example, in a Sat).
- In the staging area, join the link's source table to the tables that hold the business keys, but this requires a persistent staging area.
I do not like either option because both break the standard and its practices: loading links and hubs in parallel, loading only the changed portions, avoiding lookups, and keeping the load from stage to DV simple. In various Internet sources I found the suggestion to use the source primary key as the BK in the hub and to integrate at the Business Vault level through a Same-As Link. The book "The Elephant in the Fridge" clearly shows the unpleasant consequences this leads to: a source-centric Data Vault (aka Source Syndrome).
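To make the second option concrete, here is a minimal Python sketch of the staging join I have in mind. Everything except ContractorId and ProductId is my own assumption for illustration: the business key columns `ContractorCode` and `ProductCode`, the helper names `hash_key` and `build_order_link_rows`, the MD5 hash, and hashing the link from the two business keys only.

```python
import hashlib


def hash_key(*business_keys):
    """Hash one or more business keys into one key (MD5 is an assumption)."""
    normalised = ";".join(str(k).strip().upper() for k in business_keys)
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()


def build_order_link_rows(orders, contractors, products):
    """Resolve source primary keys to business keys, then hash them for the link."""
    contractor_bk = {r["ContractorId"]: r["ContractorCode"] for r in contractors}
    product_bk = {r["ProductId"]: r["ProductCode"] for r in products}
    link_rows, unresolved = [], []
    for order in orders:
        c_bk = contractor_bk.get(order["ContractorId"])
        p_bk = product_bk.get(order["ProductId"])
        if c_bk is None or p_bk is None:
            # The parent row is not in the staged data, so the business key
            # cannot be resolved from a delta-only staging area.
            unresolved.append(order)
            continue
        link_rows.append({
            "order_lnk_hk": hash_key(c_bk, p_bk),
            "contractor_hk": hash_key(c_bk),
            "product_hk": hash_key(p_bk),
        })
    return link_rows, unresolved


# Hypothetical staged rows; the *Id columns are source primary keys.
stg_contractor = [{"ContractorId": 1, "ContractorCode": "C-100"}]
stg_product = [{"ProductId": 7, "ProductCode": "P-555"}]
stg_order = [
    {"OrderId": 42, "ContractorId": 1, "ProductId": 7},
    {"OrderId": 43, "ContractorId": 2, "ProductId": 7},  # contractor 2 not staged
]

link_rows, unresolved = build_order_link_rows(stg_order, stg_contractor, stg_product)
```

Order 43 ends up in `unresolved` because its contractor is not in the staged batch, which is why this option seems to need a persistent staging area.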
3. The order has a reference to another related order (or, say, the Contractor has a reference to the last order made). Does this mean that in Data Vault I need an Order_Hub entity in this case, with a SAL between the orders? As far as I know, the standard does not recommend Link-to-Link. What should I do? In essence there is no business key, i.e. it is really just wiring.
PS: This is not a real DB structure, just an example of cases I have already encountered.