Build what they want, not what they need


(@patrickcuba)

A somewhat controversial title but I'm glad you've come this far.

If you are a consultant you are paid to consult. Steve Jobs famously said he doesn't like consultants because they do not own the decisions. I happen to disagree with him. Some of us are in the industry preaching what we practice (intentional inversion of the phrase) because we have a passion for it. We are emotionally entangled in the outcome. 

We've been taught the standard, and as certified DV2.0 practitioners we believe in it. I carry my training manual with me everywhere, just in case. I have seen unguided implementations of data analytics so many times that I can only describe what was implemented as genius... I would never have thought of it!

That's just a play on applied experience, but the truth of the matter is that we get to consult and recommend what we know to be best practice, yet sometimes, despite the documented advice (or gut feel) we preach, we are forced to implement something we do not agree with.

It is the nature of consulting: you're advising, someone else takes ownership and, despite your best efforts...

This is a summary of the recent battles I have fought and their outcomes:

1) modifying satellite table columns to indicate a lookup to a reference table

RAW data model column names (imho) should not be modified under any circumstance to "indicate" that they point to a ref table; this metadata must be catered for in the business glossary.

- what if through schema evolution an additional column that is added to the satellite table is given the same name?

- if you do this you are applying a business rule and a maintenance point to raw vault

- raw vault satellites are what they are labeled ... raw - minus the business keys.

outcome: I won! RAW is RAW!

2) passive integration by keys

we lower cased our business keys but were overruled despite our best advice!

a. if a business key is loaded and its case differs from an existing business key, but it is the same business entity, it will create a new hash key. In the majority of source systems business keys are case-insensitive.

b. this means we have multiple timelines for the same business entity and must resort to a business vault to resolve this anomaly and integrate the timelines, or hope for the information marts to resolve this technical debt. BV is an unnecessary additional hop in this process.

c. if we leave it up to the info marts then the logic to combine business entities is replicated across the EDW wherever such integration is needed.
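To make point (a) concrete, here is a minimal sketch of why casing matters for passive integration. It assumes MD5 as the hash function and a hypothetical `hash_key` helper with a trim-and-lower-case treatment; the actual hash algorithm and treatments on any given project may differ.

```python
import hashlib


def hash_key(business_key: str, lower_case: bool) -> str:
    """Hypothetical hub hash key: MD5 over a trimmed, optionally lower-cased key."""
    treated = business_key.strip()
    if lower_case:
        treated = treated.lower()
    return hashlib.md5(treated.encode("utf-8")).hexdigest()


# The same business entity, sent with different casing (made-up sample keys).
arrivals = ["acc-001x", "ACC-001X"]

untreated_keys = {hash_key(bk, lower_case=False) for bk in arrivals}
treated_keys = {hash_key(bk, lower_case=True) for bk in arrivals}

print(len(untreated_keys))  # distinct hash keys without the casing treatment
print(len(treated_keys))    # distinct hash keys with the casing treatment
```

Without the treatment the two arrivals land on two different hash keys, which is the "multiple timelines" problem in (b); with it they collapse onto one.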

outcome: I lost! (kind of) But through my application of business key treatments we hold the source accountable for what they send us. In two words: bitwise joins: https://www.linkedin.com/pulse/business-key-treatments-patrick-cuba/

I accepted their argument (source must match what gets sent to regulatory) but with a caveat that you must stop the source from sending us crap!

3) querying data vault

This is an unfortunate topic because like a teacher I really want my students to "apply themselves".

My consulting has resorted to... teaching SQL...

a. analytical functions to virtualise END Dates

b. representing multiple account numbers for the same account, à la writing a join query where you join a link to the same hub twice (or more).

c. persuading customers that they do not need PITs to simplify their queries... they need to attend an SQL course! I'm just kidding there is no course.

d. automating a QAssist layer above the vault that simplifies the joins between hubs+satellites and links+satellites, either of which may carry dependent-child keys, and all of them offered with depth (historical views) and as current (latest data).
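Points (a) and (b) can be sketched in a few lines of SQL, run here through Python's built-in sqlite3 module so the example is self-contained (window functions need SQLite 3.25 or later). The table names, column names and sample rows are all made up for illustration; they are not taken from any real model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_account (hub_account_hk TEXT PRIMARY KEY, account_number TEXT);
CREATE TABLE sat_account (hub_account_hk TEXT, load_date TEXT, status TEXT);
CREATE TABLE link_account_alias (hub_account_hk TEXT, hub_alias_account_hk TEXT);

INSERT INTO hub_account VALUES ('hk1', 'card-111'), ('hk2', 'loan-222');
INSERT INTO sat_account VALUES
  ('hk1', '2021-01-01', 'open'),
  ('hk1', '2021-03-15', 'suspended'),
  ('hk1', '2021-06-30', 'closed'),
  ('hk2', '2021-02-10', 'open');
INSERT INTO link_account_alias VALUES ('hk1', 'hk2');
""")

# (a) Virtual end date: the next load_date for the same hub key,
#     or a high date when there is no later record.
end_dated = conn.execute("""
SELECT hub_account_hk,
       load_date,
       LEAD(load_date, 1, '9999-12-31')
           OVER (PARTITION BY hub_account_hk ORDER BY load_date) AS end_date,
       status
FROM sat_account
ORDER BY hub_account_hk, load_date
""").fetchall()

# (b) Join the link to the same hub twice, once per hash key column,
#     to show both account numbers side by side.
aliases = conn.execute("""
SELECT a.account_number, b.account_number
FROM link_account_alias AS l
JOIN hub_account AS a ON a.hub_account_hk = l.hub_account_hk
JOIN hub_account AS b ON b.hub_account_hk = l.hub_alias_account_hk
""").fetchall()

for row in end_dated:
    print(row)
print(aliases)
```

Nothing is persisted for the end date: it is derived at query time, so the satellite stays insert-only.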

4) loading bkeys to satellites

In a previous post I outlined why I wouldn't do it, but it was a battle I had to fight.

outcome: I won! We reloaded the corrected data vault model.

 

I feel like it is a matter of negotiation and a paradigm shift to what the customer is used to:

- let's not build a future legacy system

- we are not confined to overnight batch

- you could load any part of the data vault at any time!

These are concepts data warehouse managers are not used to, and the earlier points are things customers could lean towards without realising it's the wrong path.

 

I want what they need!

They need what they (think they) want!
