Archive for the ‘Professional’ Category

Pentaho tops 4.4 Million USD in open source code

November 16th, 2006

Startups are interesting: Some days you love your job, other days you want to throw yourself out the window.  Yesterday I hated my job, for a variety of reasons.  One of the things that cheers me up when I’m in the midst of some tough stuff is looking at the big picture.

To date, Pentaho has built more than 313,981 lines of open source code, an estimated 81 person-years of effort.  At 55,000 USD per year for a developer, that equates to roughly 4,400,000 USD of “code” built and released under a business-friendly, OSI-approved open source license (MPL).
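For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The variable names are made up for illustration, and the flat 55,000 USD per developer-year is the same rough assumption used above, not a measured cost.

```python
# Figures quoted in the post; variable names are illustrative only.
lines_of_code = 313_981
person_years = 81
cost_per_developer_year_usd = 55_000  # assumed flat rate per developer-year

# Estimated value of the code: effort multiplied by the yearly cost.
estimated_value_usd = person_years * cost_per_developer_year_usd

# Implied productivity, purely as a derived curiosity.
lines_per_person_year = lines_of_code / person_years

print(f"{estimated_value_usd:,} USD")  # 4,455,000 USD
print(round(lines_per_person_year))
```

The product comes out slightly above the headline figure, which is why "roughly 4.4 million" is the honest way to state it.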

WOW!

We put the vast majority of our stuff into the open source project (more than 80%); it’s a complete product in and of itself, and that’s something I personally am proud of.  I’ve added the “OHLOH” badge for Pentaho to the upper right-hand corner, so there’s a ticker on this page to keep track of the breadth, size, and investment in the open source edition of Pentaho.

Incidentally, the metrics are calculated by a very cool upstart, Ohloh.  They slurp data from source control systems and display cool metrics about projects like ours.  Check them out!

Open Source, Pentaho, Professional

Google Images turns up a strange image

August 21st, 2006

Someone pointed out the following image, turned up by Google when searching for Pentaho.

It comes from an article about my joining Pentaho and seems to indicate that I’m a “feather in Pentaho’s cap.”  I think a graphic designer just has too much time on their hands.  Definitely.  :)

Pentaho, Professional

Last time Windows crashes on me

July 12th, 2006

End of last week, Windows was kind enough to give me the annual “Blue Screen of Somehow I Screwed Up My Own Internals I Hope You Weren’t Doing Any Real Important Work Because You’ll Have to Reinstall the Operating System of Death.”  Gasp.

We’ve all been there.  What really bugged me is that when it happened, I sighed and just thought to myself that this is “the price of computing.”  This had become normal and acceptable to me… Then I shook myself a bit and became determined to rid myself, as much as possible, of the OS from Redmond.  No offense; I love Excel, and think there’s some great usability in there, but it’s just not my cup of tea.

Eventually I’ll end up with a MacBook Pro; I feel the call of the siren as much as anybody.  Until then, I’m on SUSE 10.1 desktop and so far I’m quite pleased.

I’ll blog again later on the specifics of the setup, but I’ll just say that the XGL desktop is both wicked COOL and very functional.

Open Source, Professional

Free, Valuable, DW Wisdom from man in lilac suit

June 29th, 2006

For those who don’t know the connection between the lilac suit and Pete-S, just google it.  :)  It doesn’t really matter, though, when compared with the great set of articles that my friend Pete-S is pumping out from the other side of the Atlantic.

Pete has practical, in-the-trenches knowledge of building BI and DW systems.  He’s both sharp and practical (that’s rare, you know!).  He’s running a series called DW Wisdom, and it’s some very useful content.

DW Wisdom

I like that Pete is comfortable enough with his own skills/abilities to question the "age old wisdoms" of DW.  Even if they are found to be true, it’s good to see some real scientific "assumption breaking" to prove/disprove reality.

  • Use as many small disks as possible – a 1TB disk would be a bad idea for a system that inherently reads large volumes of data, everything would go through a single IO point.
  • Keep all the OLTP tables separate from the DW systems; OLTP has lots of small, fast transactions, DW has slower, big reads. DW loves bitmap indexes, OLTP hates them.
  • Use high degrees of parallel processing

But are these truths still valid?

DW Wisdom (2) - more of the physical

A comparison of OLTP vs. DW (OLAP-esque), including a great reference table.

DW Wisdom (3) - departmental business

When people found that their transactional systems were unsuited for BI reporting (perhaps because of the performance impact of running BI on a transactional system, or the transactional system did not hold all of the data required for reporting) they started to look towards dedicated data warehouses.

DW Wisdom (4)

Enterprise DW moves away from the tactical departmental “point solutions” and into something that fits with strategic aspirations of the enterprise. On the face of it, having a single solution across the enterprise has distinct advantages:

  • there is a single, consistent model of enterprise data
  • there is less duplication of data across the enterprise
  • it is possible to construct a security model such that the right people see all of the data that allows them to do their jobs but not the information that is too sensitive for their job role
  • the origins of all of the business data can be traced back to source

In fact these aims are so laudable that they have been hijacked by other IT disciplines such as master data management, risk and compliance management, and business process reengineering.

DW Design (part 1)

I can’t agree with Pete more: a staging area, a 3NF warehouse, and then a presentation layer (marts) is, I think, a very practical way to separate concerns and avoid tight coupling between sources and reports.  A la Corporate Information Factory.

For a long while I have favoured a three section data warehouse design: a staging area where raw fact and reference data is validated for referential integrity, a third-normal form layer to hold the reference data and historical fact, and finally a presentation layer to hold denormalised reference data and aggregated fact. The staging layer is ‘private’ to the data warehouse but user query access (subject to business security rules) to other layers is permitted. In some cases it will not be possible to use a denormalised layer; but if you can use one, you should.
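To make the three layers concrete, here is a toy sketch in plain Python. It is only an illustration of the idea, not Pete’s implementation: the tables, column names, and the `validate` helper are all made up, and a real warehouse would do this in the database rather than in lists and dicts.

```python
from collections import defaultdict

# Staging layer: raw reference and fact rows as loaded from source systems.
# All data and names here are hypothetical.
staged_products = [
    {"product_id": 1, "name": "Widget", "category": "Hardware"},
    {"product_id": 2, "name": "Gadget", "category": "Hardware"},
]
staged_sales = [
    {"sale_id": 10, "product_id": 1, "amount": 25.0},
    {"sale_id": 11, "product_id": 2, "amount": 40.0},
    {"sale_id": 12, "product_id": 99, "amount": 5.0},  # no matching product
]


def validate(facts, reference, key):
    """Split facts into rows whose key exists in the reference data and rows that fail."""
    known = {row[key] for row in reference}
    valid = [fact for fact in facts if fact[key] in known]
    rejected = [fact for fact in facts if fact[key] not in known]
    return valid, rejected


# Third-normal-form layer: validated reference data and historical fact, kept separate.
sales_3nf, rejected_sales = validate(staged_sales, staged_products, "product_id")
products_3nf = {product["product_id"]: product for product in staged_products}

# Presentation layer: denormalise (pull in the category) and aggregate the fact.
sales_by_category = defaultdict(float)
for sale in sales_3nf:
    category = products_3nf[sale["product_id"]]["category"]
    sales_by_category[category] += sale["amount"]

print(dict(sales_by_category))  # {'Hardware': 65.0}
print(len(rejected_sales))      # 1
```

Reports read only from the presentation layer, so a change in a source system touches staging and validation without rippling through to every report.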

DW Design (2) – staging data

As mentioned yesterday, the staging area of the data warehouse has three functional uses:

  • It is the initial target for data loads from source systems
  • It validates the incoming data for integrity
  • It is the data source for information to be published to the ‘user visible’ layers of the data warehouse

Optionally, it may also be where the logic to transform incoming data is applied.

Great series Pete!  Now if only this were in a book that I could tell all my blog readers and colleagues to purchase!  :)

General BI, Professional

Trying Qumana

June 1st, 2006

I recently moved to Wordpress as a blogging engine.  So far I’ve liked it very much, and the WYSIWYG editor in the web browser is pretty good.  Much better, in my opinion, than the web-based Movable Type editor.

That being said, I really do like being able to blog quickly and efficiently from my desktop.  Screen captures, text, copy and paste URLs, all make blogging feel much more like a treat; a thing you get into the rhythm of instead of taking time out of the day to get things "ready to go."

Now, if I ever decide to go Linux flat out I’ll have to do this all over again I suppose.  Until then, I think I rather like it.

One thing I think is missing from nearly all of these desktop blogging products is the ability to do real copy and paste.  Everything has to already be in a file.  For quick screen captures to blog I’d LOVE for a tool to just accept clipboard data and make it into a file.  Anyone know of any product that does that?

Professional

Move to Wordpress

May 1st, 2006

I’ve made the switch from Movable Type to Wordpress.  I’ve used Movable Type for more than two years now, quite happily actually, but thought it high time to move to Wordpress.  I’m rather impressed with Wordpress thus far; very competitive product all in all.  Some things in the installation were even EASIER than MT.

I turned on comments/trackbacks on my MT blog approximately 7 weeks ago.  In those 7 weeks, I received more than 2,000 spam comments, trackbacks, etc.  I plan to use a spam service on this blog to prevent such abuse again.

If you are encountering any feed or viewing issues please do let me know.

Professional

01:02:03 04/05/06 is this week

April 3rd, 2006

Picked up on this one from Brad Feld’s blog (VC in beautiful Boulder, home to my alma mater).

This Wednesday (or Tuesday in Europe) will have a sequential read on the clock.

Professional

Off Topic: I miss casual Fridays

March 10th, 2006

Working from home, and at client sites, I don’t get the chance to enjoy the “casual Friday atmosphere” at software companies.

Someone sent me their office winner of the Ugly Shirt Award for Friday, March 10th. :)

Reader comments are open… Ugly, Classy, or Who Cares?

Professional

Comments are back on…

March 9th, 2006

I’m not sure how long it will last until some robot computer across the world gets wise to this and starts spamming the comments.

We’ll see!

Professional

Ohhhh… Rocketboom

March 3rd, 2006

Rocketboom is beamed to my Tivo regularly. Cool feature and I think it’s a glimpse into the future as well. No cable company, no networks, just content provider and an API to zip it to my Tivo.

Anyhow, today’s Rocketboom was both odd and amusing. Usually it’s all business, but crazy day today. :)

Professional