Intro to AI, ML, DL – NLP and #DataVault 2.0
A look at how Data Vault Models can help Natural Language Processing, AI, ML, and DL algorithms.
By Cynthia Meyersohn, DataRebels LLC. Do you really want your business or enterprise to be data-driven? Before you say, “Of course we do. Companies that do not embrace a data-driven approach to competition, market strategy, and process improvement will become extinct.” Understand, this is a trick question. The term data-driven has been around forever it…
Understanding #datavault 2.0, why DV2 is important, comparing with other data modeling methods, and how it works for #bigdata, #iot, #streaming, #datawarehouse, and #datascience
Introduction: The industry has been struggling for a long time with defining a data lake. We are taking the plunge: let’s properly define a data lake. I have seen hundreds of different definitions around the world, and none of them seem to provide an organization with the foundations they need to build a successful data lake.…
By Cynthia Meyersohn. To close out this series of articles that have been focused on the data replication introduced by the processes outlined in “Build a Schema-On-Read Analytics Pipeline Using Amazon Athena”, by Ujjwal Ratan, Sep. 29, 2017, on the AWS Big Data Blog, https://aws.amazon.com/blogs/big-data/build-a-schema-on-read-analytics-pipeline-using-amazon-athena/ (Ratan, 2017), I will talk about an approach that I…
By Cynthia Meyersohn. Picking up from Part 2 of this series where we left off having replicated the data a minimum of nine times, we will continue to identify additional data replication stages as we trace through the data processes outlined in “Build a Schema-On-Read Analytics Pipeline Using Amazon Athena”, by Ujjwal Ratan, Sep. 29,…
By Cynthia Meyersohn. Continuing from Part 1 of this series, this article is following the breakdown of the AWS Schema-on-Read analytics pipeline with a focus on data movement and replication. You may recall we are tracing through the data processes outlined in “Build a Schema-On-Read Analytics Pipeline Using Amazon Athena”, by Ujjwal Ratan, Sep. 29,…
By Cynthia Meyersohn. Last August I was asked to participate in an exercise to assess whether or not the Data Vault 2.0 System of Business Intelligence (DV2) still held value. Specifically, I was asked the following: Does the size of the data now being delivered to the business make a compelling argument that the data…
Challenge: Automated Build Out. Results: 2 people in 2 weeks merged 3 systems, built a full EDW, 5 star schemas, 3 BI reports, zero production errors. Generated: 90% of the ETL code, 100% of the Staging Data Model, 75% of the finished EDW data model, 75% of the star schema data model. The competing bid?…
Challenge: Rapid Merger and Acquisition. Results: Merged 3 companies in 90 days (circa 2001). ALL systems, ALL DATA! 125 people, multiple cultures. Business value produced during the due diligence phase. JP Morgan Chase used the Data Vault model, methodology and architecture to perform due diligence, and ultimately to merge 3 companies in 90 days. I met with…