Monthly Archives: August 2006

Overview of Business Intelligence / DW

Back in May, Dan Morgan was kind enough to invite me to give a guest lecture at the University of Washington on “Data Warehousing Basics.”  After emailing these slides to a few customers lately as a decent overview, I realized they’d probably be useful online.  The deck is obviously a little light on content (they’re just slides), but it does provide some good “high level views” of dimensional modeling/DW/BI in general.  My employer, Pentaho, was generous enough to allow me time to build this presentation for the students at UW, for which they and I are grateful.  THANK YOU!

The online version: University of Washington Guest Lecture May 9
The PDF of the presentation:  University_of_Washington_Guest_Lecture_May_9.pdf

My cliffs notes:
 - If you’re doing a BI or DW project find “that guy” with the MS Access database or Excel jockey that sits outside the COO’s office.  Make him your BEST BEST friend.
 - Facts are “What” / Dimensions are “How”  Good graphics that drive the point home: [two slide graphics, not reproduced here]

Thanks to Matt Casters for letting me pillage some of his graphics and slides.

Pentaho Tech Tips: Call for prioritization

Open source is democratic, open, real.

While I have a good sense for which Tech Tips would be useful, I’d also like to ask the community which tips they’d like to see written up:

  • Mondrian: Star Schema to OLAP cubes
    A very basic Star Schema with a Fact and two Dimensions, showing how it is built into a Mondrian cube and how to build a “Pivot view” Pentaho report.
  • Mondrian: Advanced MDX
    Sets, top, running totals, etc
  • Kettle: Portable ETL
    Showing how to use parameter injection to make your Kettle solution (Jobs and Transforms) executable inside of Pentaho.
  • Kettle: Custom rollups using Excel
    Showing how to build a dimension, reporting table, etc using a very easy to use interface for business users.
  • Reporting: List of Values
    Show how to use the most unfortunately named Secure Filter component to do list of values (even though you are not REQUIRED to do any security).  Not very eloquent but the suggestion has been to call it a “Prompt For” component (see below).  Think “parameter page” driven by “select distinct name from my_reporting_table.”
  • Report Designer: How to build reports with Charts
    The latest release included the charting expressions so now one can build reports with lovely looking charts.
  • Report Designer: How to pass “Pentaho” parameters to reports
    This allows the building of drill thru parameters, titles, and other “context” from the server
  • Pentaho Spreadsheet Services: Your data looking sexy in Excel
    A quick how-to on getting an instant Excel analytic interface into ANY database.  Example with Oracle XE.

Comments are ON… vote, have your say.  I WANT to do all of these, and will, eventually.  What do YOU want to see?

Pentaho Linux .sh files

Small little tip:

The Pentaho build process doesn’t currently manage the permissions on .sh files properly.  When you download the daily builds or other demo installations you may get some errors (bash “command not found,” etc).  You need to make all .sh files in the installation executable.  Use the following command in the “pentaho-demo” directory.

find . -name '*.sh' -exec chmod +x {} +
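
If you’d like to see which scripts are affected before (and after) changing anything, here is a minimal sketch of the same idea wrapped in two small functions.  The function names are my own invention for illustration, not part of any Pentaho tooling, and the default of “.” assumes you are sitting in the “pentaho-demo” directory.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Pentaho).

# Mark every .sh file under a directory as executable.
# Defaults to "." on the assumption you run it from "pentaho-demo".
make_sh_executable() {
    dir="${1:-.}"
    # -type f skips directories; "-exec ... {} +" hands the file names
    # straight to chmod, so paths containing spaces are handled safely.
    find "$dir" -type f -name '*.sh' -exec chmod +x {} +
}

# List .sh files whose owner-execute bit (octal 100) is not set.
# Prints nothing when every script is already executable by its owner.
list_non_executable_sh() {
    dir="${1:-.}"
    find "$dir" -type f -name '*.sh' ! -perm -100
}
```

One way to use it: run list_non_executable_sh to see what is broken, then make_sh_executable, then list_non_executable_sh again and expect no output.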

Hope you find this helpful!

Open Source is agile

I’m not talking about the methodology in particular; I’m just saying that open source is agile compared to traditional software engineering practices, with their customer advisory boards vetting major features, rounds of marketing approvals of features, etc.

For instance, I submitted a Jira case to the Pentaho development staff asking them to include a jar in our demo application that is needed to run certain Pentaho Data Integration mappings.  Within 20 hours the jar had been included (it was already vetted for license since it’s part of another project) and it is now part of the daily builds.  This is the oil that makes the open source machine great: the ability of software (Pentaho as a project) to respond to real customer needs (in this case, mine).  It’s awesome!

Now that reminds me, I hadn’t highlighted some of the cool new “open source — eee” things at Pentaho yet:

  • Public Issue/Feature Roadmap:
    We have launched Jira as a place to track new feature requests, bug submissions, etc.  I greatly encourage you to register and begin using it to submit bugs / suggestions.  Can’t always say they’ll get fixed in 20 hours but they have a MUCH GREATER chance of being fixed if they’re in Jira in addition to the forums.
  • Public Source Control:
    While we’ve always published our source with every release, the source repository itself wasn’t available to anyone on an anonymous basis.  We’re now hosting a Subversion repository that allows easier access and contribution from our always valued community.  Consider this an open invitation to dig in, build a cool plugin, etc.

I’m glad these two things have happened; I think they make communication easier, more effective, and more transparent.  What do you think?

Finally, not on a lame-o, music-devoid desktop

I’ve recently made the switch to Linux, as many of you have read in my previous blog posts on the matter.

One of the things that I missed dearly, but was not a critical priority, was getting streaming MP3 (shoutcast) on my headphones.  Too many higher priority things on my plate, but I finally got XMMS and the MP3 codecs.  What a pain those pesky patents have caused for end users like me. 

977 the Kickin Country Channel never sounded so good!