Back in May Dan Morgan was kind enough to invite me to give a guest lecture at the University of Washington on “Data Warehousing Basics.” After emailing these slides to a few customers lately as a decent overview, I realized they’d probably be useful online. The deck is obviously a little light on content (they’re just slides), but it does provide some good “high level views” of dimensional modeling/DW/BI in general. My employer, Pentaho, was generous enough to allow me time to build this presentation for the students at UW, for which they and I are grateful. THANK YOU!
My cliffs notes:
- If you’re doing a BI or DW project, find “that guy”: the one with the MS Access database, or the Excel jockey who sits outside the COO’s office. Make him your BEST BEST friend.
- Facts are “What” / Dimensions are “How”
Good graphics that drive the point home:
Thanks to Matt Casters for letting me pillage some of his graphics and slides.
Kettle: Portable ETL Showing how to use parameter injection to make your Kettle solution (Jobs and Transforms) executable inside of Pentaho.
Kettle: Custom rollups using Excel Showing how to build a dimension, reporting table, etc using a very easy to use interface for business users.
Reporting: List of Values Show how to use the most unfortunately named Secure Filter component to do list of values (even though you are not REQUIRED to do any security). Not very eloquent but the suggestion has been to call it a “Prompt For” component (see below). Think “parameter page” driven by “select distinct name from my_reporting_table.”
Report Designer: How to build reports with Charts The latest release included the charting expressions so now one can build reports with lovely looking charts.
Report Designer: How to pass “Pentaho” parameters to reports This allows building drill-through parameters, titles, and other “context” from the server.
Pentaho Spreadsheet Services: Your data looking sexy in Excel A quick how-to on getting an instant Excel analytic interface into ANY database. Example with Oracle XE.
Comments are ON… vote, have your say. I WANT to do all of these, and will, eventually. What do YOU want to see?
The Pentaho build process doesn’t currently manage the permissions on .sh files properly. When you download the daily builds or other demo installations you may get some errors (bash “command not found”, etc.). You need to make all .sh files in the installation executable. Use the following command in the “pentaho-demo” directory.
find . -name '*.sh' -exec chmod +x {} +
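If you want to check what still needs fixing before (or after) changing permissions, here is a rough sketch of the same idea wrapped in two small shell functions. The function names `make_sh_executable` and `list_non_executable_sh` are made up for illustration and are not part of any Pentaho distribution; the sketch only assumes a `find` that supports `-exec ... {} +` and symbolic `-perm` modes, as GNU and BSD find do.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Pentaho).

# Mark every *.sh file under the given directory (default: current) executable.
make_sh_executable() {
    dir="${1:-.}"
    # -type f skips directories; "-exec ... {} +" passes the paths to chmod
    # directly, so file names containing spaces are handled correctly.
    find "$dir" -type f -name '*.sh' -exec chmod +x {} +
}

# Print the *.sh files that are still missing the owner-execute bit.
# Prints nothing once every script is executable.
list_non_executable_sh() {
    dir="${1:-.}"
    find "$dir" -type f -name '*.sh' ! -perm -u=x
}
```

For example, `list_non_executable_sh pentaho-demo` shows which scripts would be affected, and `make_sh_executable pentaho-demo` fixes them.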
I’m not talking about the methodology in particular; I’m just saying compared to traditional software engineering practices with customer advisory boards vetting major features, rounds of marketing approvals of features, etc.
For instance, I submitted a Jira case to the Pentaho development staff to include a jar in our demo application that is needed to run certain Pentaho Data Integration mappings. In 20 hrs the jar had been included (already vetted for license since it’s part of another project) and is now part of the daily builds. This is the oil that makes the open source machine great: the ability of software (Pentaho as a project) to respond to real customer needs (from me). It’s awesome!
Now that reminds me, I hadn’t highlighted some of the cool new “open source — eee” things at Pentaho yet:
Public Issue/Feature Roadmap: We have launched Jira as a place to track new feature requests, bug submissions, etc. I greatly encourage you to register and begin using it to submit bugs / suggestions. Can’t always say they’ll get fixed in 20 hours but they have a MUCH GREATER chance of being fixed if they’re in Jira in addition to the forums.
Public Source Control: While we’ve always published our source with every release, that source repository wasn’t available to anyone on an anonymous basis. We’re now hosting a Subversion repository that allows easier access and contribution from our always valued community. Consider this an open invitation to dig in, build a cool plugin, etc.
I’m glad these two things have happened; I think it just makes communication easier, more effective, and more transparent. What do you think?
I’ve recently made the switch to Linux, as many of you have read in my previous posts on the matter.
One of the things that I missed dearly, but was not a critical priority, was getting streaming MP3 (shoutcast) on my headphones. Too many higher priority things on my plate, but I finally got XMMS and the MP3 codecs. What a pain those pesky patents have caused for end users like me.
977 the Kickin Country Channel never sounded so good!