A lot has been published about getting data into the vault, but what about getting data out of it?
Sure, passive loading and centralized integration are great, but they take effort and real business understanding to get right, right down to ensuring that there are no business key collisions across the source systems feeding common hub tables.
Now that data is loading seamlessly into the data vault, a surprising (and common) question is: how do I get it out? Why do I now need to write far more lines of SQL to query the data than I would have needed against a flat file?
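To make the size difference concrete, here is a hedged sketch comparing the same business question asked of a dimensional mart versus a raw data vault. All table and column names (fact_order, hub_customer, sat_customer, customer_hk, and so on) are hypothetical, and the latest-record subqueries are just one common way to pick the current satellite row:

```python
# Hypothetical schemas: the same question ("current customer names and their
# order totals") written against a star schema and against a raw vault.

dimensional_sql = """
SELECT c.customer_name, f.order_total
FROM fact_order f
JOIN dim_customer c ON c.customer_key = f.customer_key
"""

vault_sql = """
SELECT sc.customer_name, so.order_total
FROM hub_customer hc
JOIN sat_customer sc
  ON sc.customer_hk = hc.customer_hk
 AND sc.load_date = (SELECT MAX(load_date)      -- latest satellite row
                     FROM sat_customer
                     WHERE customer_hk = hc.customer_hk)
JOIN link_customer_order lco
  ON lco.customer_hk = hc.customer_hk
JOIN hub_order ho
  ON ho.order_hk = lco.order_hk
JOIN sat_order so
  ON so.order_hk = ho.order_hk
 AND so.load_date = (SELECT MAX(load_date)      -- latest satellite row
                     FROM sat_order
                     WHERE order_hk = ho.order_hk)
"""

# The vault version needs several times as many lines and joins.
print(len(dimensional_sql.strip().splitlines()), "vs",
      len(vault_sql.strip().splitlines()))
```

The gap widens further once history, multiple satellites per hub, or point-in-time reconstruction come into play.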
We have written query-assistance templates for clients and are now designing a query builder: you select the data vault artifacts you want in a GUI and specify how they should join, and the builder generates the code underneath. This will be extended with tasks and a canvas for data flows, much as SAS Enterprise Guide does for data scientists.
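A minimal sketch of the kind of code generation such a builder might do, assuming a convention where a hub and its satellites share a hash key column. The names (hub_customer, sat_customer_detail, customer_hk) are hypothetical, not a real client schema:

```python
# Sketch: generate a hub-to-satellites join from artifacts picked in a GUI.
def build_vault_query(hub: str, sats: list[str], key: str) -> str:
    """Return SQL joining a hub to its satellites on the shared hash key."""
    joins = "\n".join(
        f"LEFT JOIN {sat}\n  ON {sat}.{key} = {hub}.{key}" for sat in sats
    )
    return f"SELECT *\nFROM {hub}\n{joins}"

print(build_vault_query(
    "hub_customer",
    ["sat_customer_detail", "sat_customer_address"],
    "customer_hk",
))
```

A real builder would also have to handle links, latest-record satellite filtering, and PIT/bridge tables; the GUI selections would map onto arguments like these.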
What is your experience when the customer realizes that the SQL needed to get the data out is suddenly (exponentially) larger than a similar query written for a dimensional mart?
How have you solved this?
And... are query planners at a maturity level where they solve this optimally?