Monthly Archives: March 2006

DAMA-PS Session: Forget Federated

I listened to Stephen A. Brobst, CTO at NCR Teradata, speak about “Best Practices in Meta Data Management and Enterprise Data Warehouse Deployment” this morning. I was hoping to grok some details about new metadata management techniques, but the presentation was much more focused on the “deployment/architecture” side. That being said, I think it was MUCH more useful for the audience as a whole to cover the deployment in that much detail; I personally didn’t find it all that groundbreaking.

Summary: Build an EDW based on a 20,000 ft blueprint, integrate your data into a single location (say, perhaps, a massively scalable, single-system-image, shared-nothing database) using a relational schema, build one subject area at a time, and build star schemas only when performance is an issue. Clearly the architecture advocates the Oracle/Teradata/etc. view of the world that says ONE GIGUNDOUS RELATIONAL warehouse with various semantic (dimensional) and performance (materialized, denormalized structures) views into that world. I’m not being sarcastic; I model most of my customer implementations off the CIF and think it’s a good approach from a TCO perspective.
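To make the “relational first, star schema only when you need it” idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, the `sales_by_country_v` view, and the sample rows are all made up for illustration; nothing here comes from the presentation, and a real EDW would obviously not live in SQLite. The point is only that the dimensional view is a semantic layer over the normalized tables, and the physical copy gets built later, when performance calls for it.

```python
import sqlite3


def build_demo():
    """Create a tiny normalized schema plus a dimensional view (all names hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Integrated, normalized relational model
        CREATE TABLE country  (country_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT,
                               country_id INTEGER REFERENCES country(country_id));
        CREATE TABLE orders   (order_id INTEGER PRIMARY KEY,
                               customer_id INTEGER REFERENCES customer(customer_id),
                               amount REAL);

        INSERT INTO country  VALUES (1, 'US'), (2, 'DE');
        INSERT INTO customer VALUES (10, 'Acme', 1), (11, 'Bauer', 2), (12, 'Zenith', 1);
        INSERT INTO orders   VALUES (100, 10, 50.0), (101, 11, 20.0), (102, 12, 30.0);

        -- Semantic (dimensional) view over the normalized tables
        CREATE VIEW sales_by_country_v AS
        SELECT co.name AS customer_country, SUM(o.amount) AS total_amount
        FROM orders o
        JOIN customer c ON c.customer_id = o.customer_id
        JOIN country co ON co.country_id = c.country_id
        GROUP BY co.name;
    """)
    return conn


def materialize(conn):
    """Build the physical, denormalized structure only once performance demands it."""
    conn.execute("CREATE TABLE sales_by_country_m AS SELECT * FROM sales_by_country_v")


def totals(conn, relation):
    """Read totals from either the view or its materialized copy."""
    sql = f"SELECT customer_country, total_amount FROM {relation} ORDER BY customer_country"
    return conn.execute(sql).fetchall()
```

Reports can query `sales_by_country_v` from day one; if it gets slow, `materialize` creates the physical structure, and the two return the same rows until the source data changes again.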

The key takeaway remains: if the data isn’t integrated, it won’t be useful. An EDW promotes this, but it’s not the only way. You start to realize more value as the richness of relationships and entities in the integrated view increases.

One of the things I have a beef with is the claim that the “Semantic Layer” (metadata for dimensional modeling, data freshness, etc.) can be used instead of ETL and building physical star schemas. My issue isn’t whether the reporting tools, database, and platform can adequately do this, but rather how it is managed. For example, if I define my dimension LOGICALLY, and let the REPORTING tool build a cross tab based on that dimension, that should work. BUT, how is that managed as part of the information lifecycle? I’ve seen VERY few tools that can tell you: on day X I was generating this dimension this way, and provided it to these 20 reports using this particular method. ETL tools building PHYSICAL structures are usually managed (think source code, or some sort of build and release system). In other words, if you see a report based on “CUSTOMER COUNTRY” and a date, one can say PRECISELY how that was generated, because there’s a managed solution (ETL, Database Structures) in a repository somewhere that tells you what your logical and physical ETL were at that point in time. Good luck doing that when someone is able to change this on the fly in a clever “report writing” tool.
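Here is roughly what I mean by “managed,” as a small Python sketch. The `DimensionVersion` and `DimensionHistory` classes are hypothetical, invented for this post; they do not describe any particular ETL or reporting tool. The idea is just that every change to a dimension definition gets recorded with an effective date and the reports it feeds, so you can ask “how was CUSTOMER COUNTRY generated on day X?”

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DimensionVersion:
    """One released definition of a dimension (hypothetical structure)."""
    dimension: str
    effective_from: date
    definition: str            # e.g. the SQL or mapping used to derive it
    reports: Tuple[str, ...]   # reports fed by this definition


class DimensionHistory:
    """Keeps every released definition so any past report can be explained."""

    def __init__(self) -> None:
        self._versions: List[DimensionVersion] = []

    def record(self, version: DimensionVersion) -> None:
        # Keep versions ordered by effective date so lookups can scan forward.
        self._versions.append(version)
        self._versions.sort(key=lambda v: v.effective_from)

    def as_of(self, dimension: str, day: date) -> Optional[DimensionVersion]:
        """Return the definition in force for `dimension` on `day`, if any."""
        current = None
        for v in self._versions:
            if v.dimension == dimension and v.effective_from <= day:
                current = v
        return current
```

A repository-backed ETL tool gives you something like `as_of` for free through its release history. A report writer that lets someone redefine the dimension on the fly usually keeps only the latest definition.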

Sorry Discoverer gurus… I’ve never been a fan of “faking” real dimensional structures with clever SQL generation. Not because it doesn’t work or won’t perform, but because the management life cycle of reporting tools and their configuration is ages behind that of the ETL tools. Not saying the ETL tools are great, but… you get the point.

Overall I enjoyed Stephen’s presentation. GREAT SPEAKER, actually! My favorite line from the day: “Data Marts are kind of like rabbits in the data center. They just start multiplying!” 🙂

Off Topic: Two G's

UPDATE: Some people have emailed asking if everything is alright. Overall the changes are very positive but thanks for all those concerned!
UPDATE 2: My 95% figure was, apparently, grossly incorrect. Assuming you are an IT professional making more than 35000 GBP, or 62000 USD, or 52000 EUR annually, you are richer than 99% of the world. Once again… we are so very fortunate, collectively. Happy to update and correct it here! 🙂

My life is changing… some professional aspects and personal aspects. Emotions are a part of life and I’ve been facing rational and irrational emotions and thoughts about change. I’m excited and anxious all at once…

And you want to know what? Gratitude and generosity are the most POWERFUL soothing emotions.

Gratitude for having a roof over your head, food in your cabinet, health, loved ones, and the opportunity to chase dreams professionally. The human condition is to long for more, but being thankful for what you have is a powerful idea. Humbling and potent!

Generosity, or giving with no thought of return, speaks to us all as human beings. Giving some time to strangers in need, random acts of kindness, etc.

So… if you’re reading this blog you are likely an IT professional interested in Oracle, Data Warehousing, Business Intelligence or Open Source; you are better off financially than 95(99)% of the people on planet earth. Consider taking 30 seconds that you spent reading this blog and explore the things you are grateful for. I bet you’ll feel better with that knowledge than any OWB trick or Open Source BI insight you could have grokked in the same 30 seconds.

And now… we return you to regularly scheduled programming.