The death of prevRow = row.clone()

UPDATE: This step is available in Kettle 3.2 M1.

For those that have done more involved Kettle projects you’ll know how valuable the Javascript step is. It’s the Swiss Army knife of Kettle development. The calculator step is a nice thought, but the limited set of functions and the constriction of having to enter it in pulldowns can make more complex calculations more difficult.

Those that have done “observed metric” type calculations in Kettle will know this bit of Javascript well:

var prevRow;
var PREV_ORDER_DATE;

if ( prevRow != null && prevRow.getInteger(“customernumber”, -1) == customernumber.getInteger() )
PREV_ORDER_DATE = prevRow.getDate(“orderdate”, null);
else
PREV_ORDER_DATE = null;

prevRow = row.Clone();

This little bit of Javascript allowed you to “look forward” (or back depending on your sorting) and calculate the difference between items:

  • Watching a set of “balances” fly by and calculate the transactions (this balance – prev balance) = transaction amount
    Web Page duration (next click time – this click time) = time spent viewing this web page
    Order Status time (next order status time – this order status time) = Amount of time spent in this order status (warehouse waiting)

In other words, lining data up and peaking ahead and backwards is a common analytic calculation. In Oracle/ANSI SQL, there’s a whole set of functions that perform these type of functions.

This week I committed to the Kettle 3.2x source code a step to perform the LEAD/LAG functions that I’ve had to hand write several times in Javascript. It’s been long overdue as I told Matt I designed the step in my head two years ago and he’s been patiently waiting for me to get off my *ss and do something about it.

You can find more information about the step on its Wiki page, along with a few examples in the samples/transformations/ directory.

The step allows you peek N rows forward, and N rows backward over a group and grab the value and include it in the current row. The step allows you to set the group (at which to reset the LEAD/LAG), and setup each function (Name, Subject, Type, N rows)
200901301239
Using a group field (groupseq) and LEADing/LAGing ONE row (N = 1) we can get the following dataset:
200901301238
Any additional calculations (such as the difference, etc) can be calculated like any other fields.

This was my first commit to the Kettle project, and a very cool thing happened. I checked in the base step and in true open source fashion, Samatar (another dev) noticed, and created an icon for my step which was great since I had no idea what to make as the icon. Additionally, hours after my first commit he had included a French translation for the step. He and I didn’t discuss it ahead of time, or even know each other. That’s the way open source works… well. :)

RIP prevRow = row.clone(). You are dead to me now. Long live the Analytic Query step

Leave a comment ?

10 Comments.

  1. Heads up that with 3.2.0.M1 adding the Analytic Query to a transformation saved in the repository prevents that transformation from being opened from the repository :( So be careful.

  2. is there a good free french translation tool on the internet ?..*

  3. Pretty impressive post. I just came across your site and wanted to say that I have really enjoyed reading your blog posts. Any way I’ll be coming back and I hope you post again soon.

  4. It’s a very informative writing! I am so happy I found your page. Keep up the extrordinary work. Have a good day!mail forwarder

  5. That it was great to study using your article. I really appreciated the couple of minutes which i put in reading it and want to leave a comment to express that….Best wishes

  6. strongzz Hello There. I found your blog using msn. This is a very well written article. I will be sure to bookmark it and return to read more of your useful information. Thanks for the post. I will certainly comeback.

  7. Great ¨C I should certainly pronounce, impressed with your website. I had no trouble navigating through all the tabs and related info ended up being truly simple to do to access. I recently found what I hoped for before you know it at all. Quite unusual. Is likely to appreciate it for those who add forums or something, site theme . a tones way for your customer to communicate. Nice task..

  8. It is extremely helpful for me. Big thumbs up for this weblog post!

  9. Hey! Do you use Twitter? I’llike to follow you if that would be ok. I’m surely enjoying your blog and look forward to new updates.

  10. Hmmm, I actually like your posts, but how can I add your site to my favorites?

Leave a Comment


NOTE - You can use these HTML tags and attributes:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>