How to incorporate unstructured data.


Posted by @janetmcfarland (topic starter)

Dan wrote an article on his website back in 2010, and I'm wondering whether it still applies to DV 2.0.

The article was:

https://danlinstedt.com/allposts/datavaultcat/unstructured-data-and-the-data-vault-model/

Do the following assumptions outlined in the article still apply with DV 2.0?

there are some assumptions about unstructured data (ud) that we need to go through first:

1. ud must be mined for content and context, it’s the results of that mining that are important to hook in to the structured world
2. ud can take many forms, and there’s an argument about what is, and what is not ud. my meaning of ud is as follows: word-docs, excel spread-sheets (semi-structured), e-mails, images (jpg, png, gif, etc…), text documents, movies, and audio files.
3. ud should remain in the file system, putting it in a data warehouse is slow and cumbersome (it can be done if you have the time).
4. it’s the results of the data mining/statistical analysis that are important to align to the structured data world, being able to “query” the ud in combination with the structured data is very important.
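To make assumptions 3 and 4 concrete, here is a rough sketch (not from the article) of how the mining results, rather than the file itself, might be hooked into a hub and satellite. Every name here (`HubDocument`, `SatDocumentMining`, `load_mined_document`, the `topic` and `customer_ref` attributes, the MD5 hash key) is a hypothetical choice made for illustration, not a prescribed DV 2.0 structure.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_key(*parts: str) -> str:
    """Hash the normalised business key parts (MD5 is an assumption here)."""
    normalised = "||".join(p.strip().upper() for p in parts)
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()


def hash_diff(*values: str) -> str:
    """Hash of the descriptive attributes, used to detect changes."""
    return hashlib.md5("||".join(values).encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubDocument:
    document_hk: str       # hash of the business key
    document_id: str       # business key of the document (hypothetical)
    load_date: datetime
    record_source: str


@dataclass(frozen=True)
class SatDocumentMining:
    document_hk: str
    load_date: datetime
    record_source: str
    file_path: str           # pointer only; the file stays in the file system
    mined_topic: str         # hypothetical result of text mining
    mined_customer_ref: str  # hypothetical value that ties back to structured data
    hash_diff: str


def load_mined_document(document_id, file_path, mined,
                        record_source="text-mining-job"):
    """Build hub and satellite rows from mining output, never from the raw file."""
    now = datetime.now(timezone.utc)
    hk = hash_key(document_id)
    hub = HubDocument(hk, document_id, now, record_source)
    sat = SatDocumentMining(
        document_hk=hk,
        load_date=now,
        record_source=record_source,
        file_path=file_path,
        mined_topic=mined["topic"],
        mined_customer_ref=mined["customer_ref"],
        hash_diff=hash_diff(file_path, mined["topic"], mined["customer_ref"]),
    )
    return hub, sat
```

In this sketch the mined `customer_ref` is what would let a link tie the document to an existing customer hub, so the unstructured data can be queried alongside the structured data.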

 

Could someone please give me a quick summary of what would be required to leverage unstructured data in a DV 2.0 Data Vault?

Thanks,

Janet

3 Replies