By Cynthia Meyersohn, DataRebels LLC
Do you really want your business or enterprise to be data-driven? Before you say, “Of course we do. Companies that do not embrace a data-driven approach to competition, market strategy, and process improvement will become extinct,” understand that this is a trick question.
The term data-driven has been around forever, it seems, as companies and government entities have begun to realize the possibilities afforded by rightly discerning and more fully utilizing the data they have been collecting for so long. It has been thrown around as the place to be for every enterprise that wants to excel beyond the competition. However, the term also strikes me as sterile, short-sighted, inhumane, and lacking in wisdom … void of both insight and consideration. Hence, my opening question.
I would argue that what your enterprise is truly seeking is to be information-driven, not data-driven, because data and information are two distinct things.
Data is a version of the facts collected at a specified moment in time, whereas information is the integration of a variety of facts that have been correlated, summarized, munged, extrapolated, interpreted, interpolated, and had any number of business rules and/or data science algorithms applied to them to provide decision-makers with actionable choices based on reliably calculated and predictable outcomes. Clearly, data and information are not the same thing.
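To make the distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: the sample sales and stock records, the `reorder_information` function, and the seven-day reorder rule. It shows only the general idea of turning point-in-time facts into something a decision-maker can act on, not any particular product's method.

```python
from collections import defaultdict
from datetime import date

# Data: individual facts, each captured at a specific moment in time.
# (Hypothetical sample records, invented for this illustration.)
sales_facts = [
    {"sku": "A-100", "sold_on": date(2024, 1, 1), "units": 40},
    {"sku": "A-100", "sold_on": date(2024, 1, 2), "units": 60},
    {"sku": "B-200", "sold_on": date(2024, 1, 1), "units": 5},
    {"sku": "B-200", "sold_on": date(2024, 1, 2), "units": 3},
]
stock_facts = {"A-100": 120, "B-200": 400}  # units on hand per SKU


def reorder_information(sales, stock, cover_days=7):
    """Integrate and summarize the facts, then apply a business rule.

    The rule (assumed for this sketch): flag a SKU for reorder when the
    stock on hand covers fewer than `cover_days` days of average sales.
    """
    units = defaultdict(int)
    days = defaultdict(set)
    for fact in sales:
        units[fact["sku"]] += fact["units"]
        days[fact["sku"]].add(fact["sold_on"])

    result = {}
    for sku, total in units.items():
        daily_rate = total / len(days[sku])      # summarized
        days_of_cover = stock[sku] / daily_rate  # correlated with stock
        result[sku] = {
            "avg_daily_units": daily_rate,
            "days_of_cover": days_of_cover,
            "reorder": days_of_cover < cover_days,  # business rule applied
        }
    return result


# Information: an actionable choice derived from the integrated facts.
info = reorder_information(sales_facts, stock_facts)
```

None of the raw records says "reorder A-100"; that conclusion exists only after the facts are combined and a business rule is applied.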
It would appear that both commercial and public sector organizations have been on this data-driven crusade for the last 30+ years, with heated acceleration in the last 10 years. The massive investments in infrastructure, hardware and software platforms, and personnel should, by this time, have yielded a much greater return.
Sadly, this is not the case in the majority of analytic implementations. What is lacking is a solidly engineered foundation capable of delivering quality information, not just data.
Why the repeated failures?
In our frenzy to manage and understand the vast amount of data that we are hit with on a second-by-second basis, I believe we’ve lost the true definition of what a data warehouse is and what it delivers to the business in terms of value and strategic advantage. The term “data warehouse”, and the misinformed concept of it, has come to be treated like a red-headed stepchild (no offense to the gingers out there). We’ve settled, in certain respects, for vendor- or tool-provided definitions of data warehousing and, in settling, have discounted its true meaning.
The term data warehouse was defined by William Inmon and published in 1991. Mr. Inmon updated the original definition in 2018 to read:
“A data warehouse is a subject-oriented, integrated (by business key), time-variant and non-volatile collection of data in support of management’s decision-making process, and/or in support of auditability as a system-of-record.”
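The properties named in that definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CustomerRecord` and `CustomerSubjectArea` names, the fields, and the sample records are all hypothetical, and the sketch is not Mr. Inmon's design or any vendor's implementation. It simply shows one subject (customers), integrated by a business key, kept append-only (non-volatile), and queryable as of a point in time (time-variant).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)  # frozen: a stored record cannot be modified
class CustomerRecord:
    customer_number: str  # business key that integrates the sources
    source_system: str    # where the record came from (auditability)
    loaded_at: datetime   # when the warehouse received it
    city: str


class CustomerSubjectArea:
    """Append-only collection for one subject: customers."""

    def __init__(self):
        self._records = []

    def load(self, record):
        # Non-volatile: records are only ever appended,
        # never updated in place or deleted.
        self._records.append(record)

    def history(self, customer_number):
        """Every version of one customer, oldest first."""
        matches = [r for r in self._records
                   if r.customer_number == customer_number]
        return sorted(matches, key=lambda r: r.loaded_at)

    def as_of(self, customer_number, moment):
        """Time-variant: the latest version known at `moment`, or None."""
        known = [r for r in self.history(customer_number)
                 if r.loaded_at <= moment]
        return known[-1] if known else None


# Hypothetical sample data from two source systems sharing one business key.
warehouse = CustomerSubjectArea()
warehouse.load(CustomerRecord("C-001", "crm", datetime(2023, 1, 10), "Denver"))
warehouse.load(CustomerRecord("C-001", "billing", datetime(2024, 3, 5), "Austin"))
```

Nothing in the sketch depends on a particular database, platform, or tool, which is consistent with treating the definition as a description of properties rather than of a product.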
Since its definition, vendors have tried to tie the term “data warehouse” to their specific tool, platform, or technological offering. Nothing could be further from the truth.
A data warehouse is platform agnostic by definition. It should therefore be free of any limitations imposed by a specific technology or platform, just as it should be free of dependence on any particular set of tools or technological offerings.
Quite frankly, technology is just one part of a data warehouse implementation. Planning and building a data warehouse for your business involves three critical components, not one. The approach must take into consideration People and Process, not simply Technology. If it is subject-oriented then there are processes involved; if it is in support of management’s decision-making process then there are people and processes involved; if it is a software development effort (which it is) then there are teams of people involved. And these are just a few of the people and process aspects.
I believe that one of the major contributing factors to data warehouse implementation failures is a failure to recognize that a data warehouse (or analytics solution) is not tied to technology alone. As has already been painfully experienced by many companies and public entities, moving data to a platform (Hadoop, Apache Spark, Cloudera, etc.) or utilizing Cloud storage solutions (AWS, Azure, Google Cloud) does not a data warehouse make. Nor does switching the underlying database systems (NoSQL, NewSQL, graph, columnar, etc.) from whatever legacy DB engine was there initially.
Granted, the newer technologies afford advances that help us overcome many of the constraints originally imposed by older, legacy infrastructures and environments; however, they don’t replace the foundational requirements pivotal to a successful analytics implementation.
Successful analytics implementations require a solidly engineered foundation that embraces a variety of platforms, tool sets, and technologies. Engineered foundations, like the Data Vault 2.0 solution, provide the business-centric methodology, architecture, and model needed to rapidly develop and deploy an analytic environment that harmonizes the three major components – people, process and technology – to deliver true business value.
A correctly implemented Data Vault 2.0 solution results in an Information Delivery Layer that provides integrated and auditable outcomes for the business. On such a foundation, business value comes from the quality of the information available for management’s information-based decisions, with the added value of overall enterprise-wide improvement. Couple that with a modeling and process design that is repeatable, consistent, and pattern-based, and you have the compounded benefit of utilizing automation and auto-code generation tools to accelerate your solution delivery.
After all, isn’t optimizing your investment to realize business value the endgame? So position your analytics solution to move your company beyond merely being data-driven … become an information-driven organization.
For more information on the Data Vault 2.0 system of BI and Analytics, visit https://www.datavaultalliance.com.