Pentaho Visit : Day 2

The trip from the hotel to the Pentaho offices this morning was bright and sunny, a beautiful day in Orlando. News from Seattle was 23 straight days of rain, but that doesn’t dampen my desire to head home at the end of the week. I really love Seattle and always look forward to returning.

That being said, I rather enjoyed working with the folks at Pentaho today. We got into some of the details of their solution, and much of the material and documentation started to make more sense once they explained, in plain English, which pieces of the platform fit where.

At a high level, nearly everything in the Pentaho platform executes as an “Action Sequence.” This has some significant architectural benefits that we won’t belabor here, but suffice it to say that it allows for great flexibility in deployment options. In fact, the three products below each interact with these Action Sequences through a different kind of application (an Eclipse plugin, a standalone Java app, and a JBoss-deployed web app), all drawing from the same core libraries. At a fundamental level, the Pentaho server is metadata driven (not in the database-metadata sense), in that the Pentaho base simply implements solutions defined by a variety of XML documents. Nothing new here (this is common I’d say, anyone feel differently?) but a good choice all the same.

What is an Action Sequence? A sequenced and parameter-driven set of Action Definitions (e.g., run report, generate PDF, email to Joe).

What is an Action Definition? A configured instance of an Action Component (e.g., EMAIL=Joe, SUBJECT=Aug 2006 Sales Report, component “Email Sender”).

What is an Action Component? An implementation of an activity in the system (e.g., an Email Sender component, a PDF Generation component, etc.).

The AS is the driving class (for x in resultSet, do AD1, AD2, AD3). The AD is the specific instance of a call to a component (AD1 with Email=Joe, smtpserver=mail.bayontechnologies.com, etc). The AC is the Java class that implements the interface for components (public class EmailSenderComponent implements WhateverPentahoInterfaceItActuallyIs).
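To make the three terms concrete, here is a minimal Java sketch of how they relate. Everything in it is a placeholder of my own: the `ActionComponent` interface, its `execute` signature, and the `region` column are assumptions for illustration, not Pentaho’s actual API, and the email component only describes what it would send.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for whatever interface Pentaho components really implement.
interface ActionComponent {
    String execute(Map<String, String> settings, Map<String, String> row);
}

// Action Component (AC): a reusable implementation of one activity.
class EmailSenderComponent implements ActionComponent {
    @Override
    public String execute(Map<String, String> settings, Map<String, String> row) {
        // A real component would talk to an SMTP server; this one only reports the send.
        return "email to " + settings.get("EMAIL") + ": " + settings.get("SUBJECT")
                + " (" + row.get("region") + ")";
    }
}

// Action Definition (AD): a component plus the settings for one specific call.
class ActionDefinition {
    private final ActionComponent component;
    private final Map<String, String> settings;

    ActionDefinition(ActionComponent component, Map<String, String> settings) {
        this.component = component;
        this.settings = settings;
    }

    String run(Map<String, String> row) {
        return component.execute(settings, row);
    }
}

// Action Sequence (AS): the driver, i.e. "for x in resultSet, do AD1, AD2, AD3".
class ActionSequence {
    private final List<ActionDefinition> definitions = new ArrayList<>();

    ActionSequence add(ActionDefinition definition) {
        definitions.add(definition);
        return this;
    }

    List<String> run(List<Map<String, String>> resultSet) {
        List<String> log = new ArrayList<>();
        for (Map<String, String> row : resultSet) {
            for (ActionDefinition definition : definitions) {
                log.add(definition.run(row));
            }
        }
        return log;
    }
}

class ActionSequenceDemo {
    static List<String> demo() {
        Map<String, String> settings = Map.of("EMAIL", "Joe", "SUBJECT", "Aug 2006 Sales Report");
        ActionSequence sequence = new ActionSequence()
                .add(new ActionDefinition(new EmailSenderComponent(), settings));
        // Two illustrative result rows; the "region" column is made up for this sketch.
        List<Map<String, String>> resultSet =
                List.of(Map.of("region", "East"), Map.of("region", "West"));
        return sequence.run(resultSet);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The point of the split is that the same component class can back many definitions, and the same definitions can be reordered or reused across sequences without touching Java code, which fits with solutions being defined in XML.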

OK… now that the basics are out there, let’s talk specifics. Today we got to dig into three major pieces of the system:

Report Server
This is the piece that runs on a server somewhere and executes Pentaho solutions. It has a web interface for interacting with the Pentaho server, a runtime repository for running these solutions, and the ability to schedule Action Sequences. This is similar to Crystal Enterprise Server, which receives reports and then schedules, runs, and distributes them. The demo installation is wicked easy on Windows, as I alluded to yesterday. The installation on Linux did require some coaxing, as do many things in Linux. Clearly, the out-of-the-box implementation works best on Windows and Linux requires some effort, so I’ll ding Pentaho for that. However, I can’t really fault them, because 90% of evaluators will be giving them their 10 minutes of eval time on Windows boxes, so it’s a good decision for the project/company.

Some screenshots of the working server, with a handful of the reports that come with the installation:

Notice the parameter-driven selections. We’re getting into those tomorrow, but it looks promising.

The Crosstab JFreeReport is quite limited.

There is some cool dashboard stuff too… We’ll be getting into that later on.

Pentaho Report Wizard
This is a standalone Java application that makes getting a basic report “running” pretty quick. It’s in its infancy; in fact, I don’t believe that it has been released on sourceforge.net yet, but I might be mistaken on that. Pentaho is planning to release it on SourceForge in a few days.

Step 1: Define your data location and your query.

Notice the lack of a query builder, which will deter some from even considering this a useful wizard. Pentaho acknowledged they need that piece in here and will work at improving the wizard incrementally. However, there’s little you can’t copy and paste into the SQL area, so Step 1 is quite powerful in getting at your data. In Oracle for example, consider the power of ‘SELECT MONTH, VALUE, GROUP, LAG(VALUE) OVER (PARTITION BY GROUP ORDER BY MONTH) PRIOR_PERIOD FROM MY_VALUES_TABLE’ (column names are illustrative; GROUP is a reserved word in Oracle, so a real column would need a different name or quoting). Good for the SQL gurus.

Step 2: Select which columns are your items, which are to be “grouped” in the report.

Nothing really special here. At this point I notice that nothing in this application actually requires a fat Java Client. It’s all check boxes, select boxes, arrow buttons etc similar to some of the JSF ADF Faces components that Oracle puts out. This is a prime candidate for a community built AJAX wizard! Any takers on that one?

Step 3: Report Format Options
Missed that screenshot, oops.
This is where you can set page breaks at group boundaries, etc. There are also some formatting options here, such as background color and justification.

Step 4: Formatting Options.

Now, this is not really that advanced… It provides the “formatting options” for the wizard, which doesn’t actually use a template. I believe Mike on the Pentaho team coined it the “Non-Template Template.” Basically, because you’re using a wizard, you give it basic formatting things (group heading font, color, etc.) and it will generate a template for you. You want more than that, you gotta build your own. Incidentally, underneath the covers this wizard is building a JFreeReport definition. Pentaho can build reports using JFreeReport, BIRT, and Jasper, but I think JFreeReport is what the team is using for the wizard.

Step 5: Preview

The output options are generated by JFreeReport: Excel, PDF, and HTML. The PDF on Linux blew up for me. The HTML worked all right. Didn’t try the Excel. I’ll wait for Open Document (just kidding). As you’ll see, JFreeReport actually generates a pretty decent report. Clearly this is not the “pixel perfect” solution some commercial offerings have, but it’s really not bad.

Step 6: NICK’s BONUS STEP

You also need to save the report. 🙂 The natural last step of the process is to save the report (four files that constitute the report definition) so it can be used by either the Report Server (above) or the Workbench (below).

Pentaho Workbench:

The workbench is an Eclipse plugin that edits Pentaho solutions. Since their solutions are, for all intents and purposes, plain XML documents, this is a good fit. My initial impressions are, well, just OK. Clearly this beats writing XML by hand based on their spec, but it’s a pretty rough GUI when it comes to making sense out of the whole thing. This GUI will be efficient for those who know the underlying XML structures and the specifics of their Pentaho Action (Components/Sequences/Descriptors). However, for the person coming onto the platform it will be kind of daunting. Again, it should improve, but for now it’s not exactly user friendly.

We used the workbench to take the report we created in the wizard and “parameterize” it. We made one of the WHERE clauses in the SQL parameter driven and bound it to a “request.PARAM1” item that Pentaho will set in its server environment. It was a little difficult understanding the context of what we did, but I have to say that when we copied it up to the server it worked brilliantly.
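For anyone who hasn’t done this kind of thing before, here is a rough Java sketch of the general idea of driving a WHERE clause from a request parameter. To be clear about what is made up: the `{PARAM1}` placeholder syntax, the `ParameterBinder` and `BoundQuery` classes, and the `REGION` column are all my own assumptions for illustration. This is not how Pentaho implements it internally, just the common pattern of turning named placeholders into JDBC-style `?` markers plus an ordered list of values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for the rewritten SQL and the values to bind, in order.
class BoundQuery {
    final String sql;
    final List<String> values;

    BoundQuery(String sql, List<String> values) {
        this.sql = sql;
        this.values = values;
    }
}

// Hypothetical helper: replaces {NAME} placeholders with "?" and collects the
// matching request values so they can be bound rather than concatenated.
class ParameterBinder {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    static BoundQuery bind(String template, Map<String, String> request) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer sql = new StringBuffer();
        List<String> values = new ArrayList<>();
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!request.containsKey(name)) {
                throw new IllegalArgumentException("Missing request parameter: " + name);
            }
            values.add(request.get(name));
            matcher.appendReplacement(sql, "?");
        }
        matcher.appendTail(sql);
        return new BoundQuery(sql.toString(), values);
    }
}

class ParameterBinderDemo {
    public static void main(String[] args) {
        BoundQuery query = ParameterBinder.bind(
                "SELECT MONTH, VALUE FROM MY_VALUES_TABLE WHERE REGION = {PARAM1}",
                Map.of("PARAM1", "East"));
        System.out.println(query.sql);
        System.out.println(query.values);
    }
}
```

Binding values instead of pasting them into the SQL string keeps user input from rewriting the query, which matters once a report is exposed through a web interface.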

More tomorrow… Email me at the usual (on the right column)… I’ll also offer to bring along any questions, especially those from the Oracle community, to see if I can’t get them answered.
