“Science can amuse and fascinate us all, but it is engineering that changes the world.” —Isaac Asimov
You’re already aware that the Data Vault 2.0 System of Analytics has three core pillars: the Model, the Methodology, and the Architecture. The Methodology also includes implementation as a sub-pillar. And it has data engineering baked in.
Now, most architectures have patterns or are based on them, but the Data Vault is unique in having a strongly engineering-driven design paradigm, one that applies across people, process, and technology.
You’ll find many data engineering fundamentals addressed here, such as flexibility and repeatability.
Engineered solutions cannot work without patterns, and for something you want to build for the long term, patterns become even more important. Just like an engineered structure, the Data Vault requires upfront thought and design. You cannot build a bridge without standardized component parts, or a building without standardized bricks.
But Data Vault 2.0 has the advantage of being engineered digitally instead of physically, so it can apply agile principles and let you build the foundation in stages.
The focus has always been to build a usable part first, then move on and build another usable part, and so on.
The fact that the components end up fitting together like toy blocks in the Data Warehouse is intentional and by design. Every new element added to the engineered Data Warehouse enriches the existing sets.
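That repeatability is easiest to see in how every hub is loaded the same way. As a minimal sketch (the table and column names, and the MD5-over-delimited-keys convention, are assumptions for illustration, not a prescribed implementation), the pattern is always: normalize the business key, hash it, deduplicate, insert.

```python
import hashlib

def hash_key(*business_keys: str) -> str:
    """Derive a deterministic hub hash key by hashing trimmed,
    upper-cased business keys joined with a delimiter."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def stage_hub_rows(source_keys):
    """The same repeatable pattern applies to every hub:
    normalize, hash, deduplicate, then stage for insert."""
    seen = set()
    rows = []
    for key in source_keys:
        hk = hash_key(key)
        if hk not in seen:  # each business key lands in the hub once
            seen.add(hk)
            rows.append({"hub_customer_hk": hk, "customer_bk": key})
    return rows

rows = stage_hub_rows(["C-1001", "c-1001 ", "C-2002"])
# "C-1001" and "c-1001 " normalize to the same key, so two rows remain
```

Because the pattern is identical for every hub, link, and satellite, each new source plugs into the existing structure rather than reshaping it.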
But we all know that when we start loading data into the Data Warehouse, even while following all the agile methodologies, we end up with some data that just wasn’t integrated correctly in the sources. You will see these as entity and identity resolution issues in your data, for example.
So, where do we actually resolve these?
If you have a reliable resolution and mapping available, it can even land in the Raw Data Vault itself. Otherwise, it belongs in a Business Vault construct, another important engineering component, one designed for the business to enhance sharing and re-usability.
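As an illustration of such a Business Vault mapping, duplicate business keys from different sources can be resolved to a single master key. The keys and the mapping below are entirely hypothetical, a sketch of the idea rather than a prescribed structure:

```python
# Hypothetical same-as mapping: duplicate business keys observed in the
# sources, each resolved to one master key in a Business Vault construct.
SAME_AS = {
    "CUST-77A": "CUST-77",    # assumed duplicate from source system A
    "CUST-0077": "CUST-77",   # assumed duplicate from source system B
}

def resolve(business_key: str) -> str:
    """Map a raw business key to its resolved master key; keys with
    no known duplicate pass through unchanged."""
    return SAME_AS.get(business_key, business_key)

resolve("CUST-0077")  # "CUST-77"
resolve("CUST-99")    # "CUST-99" (no mapping, passes through)
```

Keeping this mapping in the Business Vault means the raw data stays auditable as delivered, while the resolved view can be shared and reused across consumers.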
According to Matt Florian, “The ability of DV 2.0 to be both a methodology and an engineering pattern makes it a powerful tool when building the enterprise data pipeline.”
In his upcoming talk at the WWDVC, titled “The Data Vault as an Engineering Pattern”, he’ll cover the model, architecture, implementation, ways of working and agility, and even physical design.
It’s yet another session you won’t want to miss at the upcoming WWDVC. Just this session covers:
- How the Data Vault can serve as a set of data engineering patterns that can be applied to various projects.
- How the Data Vault’s unique architecture and methodology provide a flexible foundation for data warehousing and analytics.
- How to build and maintain an efficient and scalable data infrastructure using the Data Vault.
- Streamlining data modeling, development, and deployment processes with the Data Vault.
- Examples of how to leverage the Data Vault methodology to improve data engineering capabilities.
And, there’s so much more.
With a first-time Business Monday that focuses on answering pertinent business questions, there’s something for everyone at this year’s WWDVC. Last I checked, there were only a handful of tickets remaining and not much time at all. They may already be sold out by now, but you can still check at -> https://wwdvc.com/#tile_registration
Data engineering is best done with a longer-term vision, designing a solution that will outlast everything else. That is the only way to truly treat your data as an asset.