Data Vault Business Value
One of my favorite topics to debate and discuss is how easy it is for business people to get access to the data they need, at the pace they need it, when the analytics data platform is installed as a Data Vault. Naïve pundits will sometimes try to argue that because data is stored and managed in a Data Vault, it is hard for business people to access essential data when they need it. Nothing could be further from the truth. Comments like these obscure the data vault business value proposition.
I think the critics’ arguments about data vault business value and how hard it all is stem from a fundamental misunderstanding. We build Data Vaults, and follow the DV2.0 prescriptive approach, because we want to de-risk deployments. We want to build and install a “robust and malleable data platform.”
We must have an environment that is resilient to changing source data and that scales well. The fact that we don’t have to keep rebuilding tables is one of the primary advantages of DV2.0. If new data and their form don’t fit into current Hubs and Satellites, we just add more. There is never any need to rebuild things. But I think this might be where the misunderstanding comes in.
Naysayers take the position that architects and data engineers are so busy adding Hubs and Satellites to a Data Vault that business people are left struggling to keep pace with all the backroom minutiae. “There are so many tables, we don’t know where to start,” they assert. “It’s all so complicated and unwieldy.”
Data Vault Standard: Infomart Layer
But let us remember that the DV2.0 standard maintains, as an axiom, the Infomart Layer. It is here where the primary business interface resides. The raw vault and business vault entities are not, or at least should not be, directly accessed by data consumers.
Maybe data science practitioners prefer, in certain circumstances, to access raw or business vaults directly, but this is the exception, not the rule. Report writers, analysts, and actuaries get what they need, consistently, from the Information Mart layer. And this access point is not complicated. And it’s certainly not unwieldy. It is the stable, predictable part of the whole environment. And when business folks enter here, the results are always reliable.
Some people say that the only way for business people to productively interact with a Data Vault is to be trained in all the jargon and modeling details. I have to confess that I routinely debate this thesis with my colleague, Sanjay Pande. He loves to say that training is the best way to learn DV2.0 properly and get the best out of your investment in DV2.0.
I say, “Of course, if one is inclined to get trained in an authorized class delivered by one of our Authorized Training Partners, the outcome is terrific.” I mean, come on, who doesn’t like trained resources?
But I don’t think this is at all a hard requirement for analysts and other data consumers. In fact, if business people interact at the Infomart level, there really shouldn’t be a need for Data Vault training. I get it. These people are busy.
If you tell them the only way to interact with a Data Vault and get what they need to do their jobs is by heading off to Data Vault training, they may just run for the hills. However, if a data vault is done right, the same team could see the data vault business value and run straight into our arms.
Data Vaults Done Right - Data Vault Business Value
The fact is that as long as your data engineers and architects are properly trained by authorized training partners, they will build the right thing. And that includes, without exception, the user-facing Infomart layer, which includes dimensional models, cubes, flat wide tables (the things Data Scientists like to use), and even web services and virtualization tools.
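To make the idea concrete, here is a minimal sketch of how an Infomart artifact can be derived from raw vault entities, so that business users query a simple flat table rather than the vault itself. The table and column names (`hub_customer`, `sat_customer_details`, `dim_customer`) and the sample data are illustrative assumptions, not prescribed by the DV2.0 standard.

```python
import sqlite3

# Hypothetical, minimal Raw Vault tables: a Hub holding business keys,
# and a Satellite holding descriptive attributes over time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk TEXT PRIMARY KEY,   -- hash key
    customer_id TEXT                -- business key
);
CREATE TABLE sat_customer_details (
    customer_hk TEXT,
    load_date   TEXT,
    name        TEXT,
    region      TEXT
);
INSERT INTO hub_customer VALUES ('hk1', 'C-001'), ('hk2', 'C-002');
INSERT INTO sat_customer_details VALUES
    ('hk1', '2024-01-01', 'Acme',      'EMEA'),
    ('hk1', '2024-06-01', 'Acme Corp', 'EMEA'),  -- newer satellite row
    ('hk2', '2024-01-01', 'Globex',    'APAC');

-- The Infomart layer: a flat, current-state view business users query.
-- They never need to know Hubs and Satellites exist underneath.
CREATE VIEW dim_customer AS
SELECT h.customer_id, s.name, s.region
FROM hub_customer h
JOIN sat_customer_details s ON s.customer_hk = h.customer_hk
WHERE s.load_date = (SELECT MAX(load_date)
                     FROM sat_customer_details
                     WHERE customer_hk = h.customer_hk);
""")

rows = conn.execute(
    "SELECT customer_id, name, region FROM dim_customer ORDER BY customer_id"
).fetchall()
```

The point of the sketch: new Satellites can be added underneath without ever rebuilding tables, while the `dim_customer` interface the business sees stays stable and simple.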
Data Vaults Done Right, our slogan at the Data Vault Alliance, captures this notion perfectly. If the trained data engineers build the right thing, the business people will be fed the data they need, in a form they want it, without the requirement to become Data Vault-savvy, much less become experts.
Agile Data Vault Business Value Commitment
Certainly, there are process issues that add to the misunderstanding. I preach, and bake into the DV2.0 standard, the idea of constant delivery of business value.
So that means behaving flexibly, putting the business’s needs first, and delivering value constantly through an agile methodology. If your organization ignores this part of the standard and operates with a big-bang mentality, you’re going to get what you asked for: failure.
There are no business people at this stage in history who are willing to patiently wait for the technical staff to deliver value. If value isn’t coming, business people will find a way around it, and seek value elsewhere. They may even develop their own value-add environments, or hire external consultants to do so.
But should we “blame” the Data Vault approach in this scenario? No. Don’t blame Data Vault. Blame the practitioners who went off into the ditch and misapplied DV2.0. In other words, it’s the people, not the approach, that get organizations in trouble.
Importance of the Data Vault Taxonomy
But I would add that if there is one place to get everyone on the same page that really does matter, it’s in the development and utilization of our business vocabulary: our taxonomy.
Data Vaults operate at their best when business rules, keys, and relationships between data elements are well understood. Here again, this can be done in an incremental, agile way. Let’s avoid boiling the ocean. But getting everyone who uses data aligned with those who source, aggregate, and model it can be a magical moment.
Data catalogs help, for sure. But however you do it, building a sustained business taxonomy will ensure that business users not only get the data elements they need, but also are able to interpret the data in a context of how to apply it.
One analyst’s interpretation of “product margin,” for example, should never be confused with another’s. Margin is margin. And if it isn’t for your organization, you probably need more words in the taxonomy that are truly unique.
Maybe one practitioner’s “margin” is somebody else’s “contributed margin,” or another designation altogether. This is something that Bill Inmon has talked to me about for years. It is also in line with much of the discussions we are hearing about the Semantics Layer; i.e., deriving information based on the business meaning.
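A quick arithmetic sketch shows why the distinction matters. The figures below are hypothetical, and the two formulas are the commonly used definitions of gross margin and contribution margin; the point is simply that one word, “margin,” can yield two very different numbers.

```python
# Hypothetical figures for one product line.
revenue = 100_000.0
cogs = 60_000.0            # cost of goods sold
variable_costs = 75_000.0  # COGS plus variable selling costs

# Analyst A's "margin": gross margin.
gross_margin_pct = (revenue - cogs) / revenue * 100

# Analyst B's "margin": contribution margin.
contribution_margin_pct = (revenue - variable_costs) / revenue * 100

print(gross_margin_pct)         # 40.0
print(contribution_margin_pct)  # 25.0
```

Same source data, same word, a fifteen-point gap. A shared taxonomy that names these as two distinct terms is what keeps a report built by one analyst from being misread by another.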
Whether from a poorly defined taxonomy, from ignoring the agile-delivery part of the DV2.0 standard, or from dismissing the Infomart layer of the Data Vault, we can see how misunderstandings get created and Data Vault 2.0 gets unfairly criticized.
But to suggest that it’s somehow hard to get data out of a well-designed Data Vault, developed and deployed by trained and competent architects and data engineers, is simply wrong. Getting data out is easy, and is part and parcel of the DV2.0 standard.