Archive for the 'Professional' Category

bayon is back

Friday, October 26th, 2007

If you’ve been reading since the early days of this blog (bayon blog), you’ll know what I’m talking about. If you’re a reader who has joined in the past year and a half, you’re probably wondering “What is bayon?”

bayon is a boutique consulting firm specializing in Business Intelligence implementations; it’s my company that I’ve operated since 2002. I put it on the back burner when I put on a Pentaho jersey and played a few games on the Pentaho team. I’m leaving (actually, left) Pentaho. My time at Pentaho was great. The Pentaho tribe is a great group of kind, honest, smart people. Rare to find the intersection of good people and good technologists.

I’ve felt the siren call of helping customers in a more entrenched way. Consulting does that I think. So, not like it’s a big announcement, but it is belated as my last day at Pentaho was nearly two months ago:

I’m now working at bayon full time, building a dedicated practice around Open Source BI technologies in the enterprise. bayon has joined the Pentaho partner program as a Certified Systems Integrator.

So there you have it. Shingle is out.

If you are interested in Pentaho, Open Source ETL, Open Source BI, etc., don’t hesitate to get in touch.

PS - It’s also worth noting that my leaving has no reflection on the progress of the business. Quite the opposite really; some would consider me foolish for leaving when the company is doing as well as it is!

Well, I noticed

Tuesday, June 26th, 2007

My workday goes much smoother because I listen to a variety of Online Radio stations. Today they all went silent or played public awareness campaigns from SaveNetRadio.org. People have been wondering if anyone will notice. Well, like the title says: I noticed!

I don’t know all the mechanics, but it comes down to this: a lot of these small, boutique-ish online radio stations will shut down because the cost structure of the royalty compensation will be, in their opinion, unsustainable.

I, for one, being a proponent of open content, software, and standards, think there must be some underlying disconnect between the “Copyright Royalty Board” and broadcasters.

These small, hobbyist online broadcasters are part of a larger shift in broadcasting/economies: Web 2.0-ish user generated content and participatory systems of consumer and producer.

The CRB probably needs to take another look at what it’s doing to see if it’s just trying to hang on to old ideologies in a new world.

I support Net Radio. :) You should too.
SaveNetRadio.org

Pet Peeve: EST != EDT

Friday, June 22nd, 2007

I work with people all over the country and the world. What that means is that we often schedule meetings, calls, webex meetings, remote consulting sessions, etc. Lacking some great shared calendar in the cloud that we can use to do this ad hoc (I’m sure there’s some web 2.0 startup that does this, so please comment if you know of something GOOD), people end up emailing suggested and adjusted times back and forth.

For instance, just yesterday, I received the following email:

The regular 10am EST XYZ meeting tomorrow is cancelled until further notice.

What’s the issue with this email? Well, we don’t have a 10am EST meeting. We have a meeting scheduled at 10am Eastern (i.e., when the clock in the eastern time zone hits 10am during the summer months).

EDT and EST have VERY SPECIFIC timezone offsets.

EDT = UTC - 4
EST = UTC - 5

I generally use, and I think many others also use, “Eastern” to refer ONLY to local time, i.e., what the clock on the wall says in New York regardless of EDT/EST.

Let’s take the above example:

  • 10:00 EST on June 22 (someone sends an email requesting a meeting)
  • 10:00 EST = 15:00 UTC (given the definition of EST, with an offset of -5 hours)
  • 15:00 UTC = 11:00 EDT (i.e., makes sense, right? 10:00 EST = 11:00 EDT)
  • 11:00 Eastern is therefore the actual meeting time (on June 22 New York is in EDT, so 10:00 EST lands at 11:00 on the wall clock)
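
The same conversion can be checked mechanically. This is only a small sketch using Python's standard library with the two fixed offsets defined above (EST = UTC - 5, EDT = UTC - 4); the variable names are mine, and the year is taken from the date of this post.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets exactly as defined in the post; the names are labels only.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

# The meeting as literally written in the email: 10:00 EST on June 22.
as_written = datetime(2007, 6, 22, 10, 0, tzinfo=EST)

# Same instant expressed in UTC, and on the New York wall clock in June (EDT).
in_utc = as_written.astimezone(timezone.utc)
on_the_wall = as_written.astimezone(EDT)

print(in_utc.strftime("%H:%M %Z"))       # 15:00 UTC
print(on_the_wall.strftime("%H:%M %Z"))  # 11:00 EDT
```

So anyone who takes "10am EST" literally in June shows up an hour after everyone who reads it as "10am Eastern".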

Obviously you assume that someone requesting a meeting for 10am EST on a day that falls in EDT was ACTUALLY requesting a meeting at 10am EDT. But why make everyone do that guesswork?

My suggestion to people that can’t keep it all straight:

Use Eastern/Pacific instead of EST/EDT or PST/PDT. Eastern/Pacific makes it clear that you mean local time, and you haven’t confused things by requesting an incorrect time.

Why “web 2.0” works

Monday, April 2nd, 2007

There’s an intersection of value at
a) products that are web service and plugin enabled
b) companies that provide interesting “net effect” services

For instance, tonight my hosting provider, Dreamhost, emailed and said that my feed was being hit so extensively that I was causing service interruptions on the other accounts on the machine. 

First, to the other sites, and I have no idea who you are but I’m sorry!

Second, I was able to leverage a web 2.0-ish service and plugin to instantly alleviate the pain (with a cached, redirected version of my feed) and now I’m getting some cool extra stats on my RSS (who/how/what).  All in < 30 minutes.

Because my blogging software works in the networked world, things happen more easily and naturally than if I had to hack together a bunch of special scripting/patching code on my website.  Plugins/Service/etc.  It’s a grand new world.

PS - I guess I’m over that whole “blogger’s block” thing.  Spouting out useless crap on my blog again.

On bad things happening to good people

Friday, November 17th, 2006

My friend Mark Rittman recently lost his entire library of 700 blogs/articles/etc.  He’s handling it with SOOO much grace; testament to him as a gentleman and all-around great guy.  I know I personally would be furious, bitter, and livid (at least for a few weeks).

The worst part about the whole deal: the hosting company is unapologetic enough to state:

Customers who have their own backups will be able to restore their own
data. Our terms and conditions advise customers to have their own
backups in case there is a catastrophic loss. This is the first time we
have suffered such a loss.

Mark, I’m soooo sorry.  Here’s hoping you get some of it back (BI Blogs, OraBlogs, etc). 

Pentaho tops 4.4 Million USD in open source code

Thursday, November 16th, 2006

Startups are interesting: Some days you love your job, other days you want to throw yourself out the window.  Yesterday I hated my job, for a variety of reasons.  One of the things that cheers me up when I’m in the midst of some tough stuff is looking at the big picture.

To date, Pentaho has built more than 313,981 lines of open source code.  It’s an estimated 81 person years.  At 55,000 USD / year for a developer, that equates to roughly 4,400,000 USD of “code” built and released under a business friendly, OSI approved, open source license (MPL).
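
For anyone who wants to check the arithmetic, here is a back-of-the-envelope sketch. It assumes nothing more than the simple model quoted above (person years times a flat developer salary); the variable names are mine.

```python
# Figures quoted in the post.
person_years = 81
salary_usd_per_year = 55_000

# Flat-salary model: estimated cost = effort * yearly salary.
estimated_cost_usd = person_years * salary_usd_per_year

print(f"{estimated_cost_usd:,} USD")  # 4,455,000 USD
```

That comes out a touch above the 4.4 million in the title.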

WOW!

We put the vast majority of our stuff into the open source project (more than 80%); it’s a complete product in and of itself and that’s something I personally am proud of.  I’ve added the “OHLOH” badge for Pentaho to the upper right hand corner so there’s a ticker on this page to keep track of the breadth, size, and investment in the open source edition of Pentaho:

Incidentally, the metrics are calculated by a very cool upstart, ohloh.  They slurp data from source control systems and display cool metrics about projects, like ours.  Check them out!

Google Images turns up a strange image

Monday, August 21st, 2006

Someone pointed out the following image turned up by Google when searching for Pentaho

It comes from an article about my joining Pentaho and seems to indicate that I’m a “feather in Pentaho’s cap.”  I think a graphic designer just has too much time on their hands.  Definitely.  :)

Last time Windows crashes on me

Wednesday, July 12th, 2006

End of last week, Windows was kind enough to give me the annual “Blue Screen of Somehow I Screwed Up My Own Internals I Hope You Weren’t Doing Any Real Important Work Because You’ll Have to Reinstall the Operating System of Death.”  Gasp.

We’ve all been there.  What really bugged me is that when it happened, I sighed and just thought to myself that this is “the price of computing.”  This had become normal and acceptable to me… Then I shook myself a bit and became determined to rid myself, as much as possible, of the OS from Redmond.  No offense; I love Excel, think there’s some great usability in there, but it’s just not my cup of tea.

Eventually I’ll end up with a MacBook Pro; I feel the call of the siren as much as anybody.  Until then, I’m on SUSE 10.1 desktop and so far I’m quite pleased. 

I’ll blog again later on the specifics of the setup, but I’ll just say that the XGL desktop is both wicked COOL and very functional.

Free, Valuable, DW Wisdom from man in lilac suit

Thursday, June 29th, 2006

For those that don’t know the connection between the lilac suit and Pete-S, just google it.  :)  It doesn’t really matter, though, when compared with the great set of articles that my friend Pete-S is pumping out from the other side of the Atlantic.

Pete has in-the-trenches, practical knowledge of building BI and DW systems.  He’s both sharp and practical (that’s rare, you know!).  He’s running a series, DW Wisdom, and it’s some very useful content.

DW Wisdom

I like that Pete is comfortable enough with his own skills/abilities to question the "age old wisdoms" of DW.  Even if they are found to be true it’s good to see some real scientific "assumption breaking" to prove/disprove reality.

  • Use as many small disks as possible – a 1TB disk would be a bad idea for a system that inherently reads large volumes of data, everything would go through a single IO point.
  • Keep all the OLTP tables separate from the DW systems; OLTP has lots of small, fast transactions, DW has slower, big reads. DW loves bitmap indexes, OLTP hates them.
  • Use high degrees of parallel processing

But are these truths still valid?

DW Wisdom (2) - more of the physical

Comparison of OLTP vs DW (OLAP-esque).  Great reference table.

DW Wisdom (3) - departmental business

When people found that their transactional systems were unsuited for BI reporting (perhaps because of the performance impact of running BI on a transactional system, or the transactional system did not hold all of the data required for reporting) they started to look towards dedicated data warehouses.

DW Wisdom (4)

Enterprise DW moves away from the tactical departmental “point solutions” and into something that fits with strategic aspirations of the enterprise. On the face of it having a single solution across the enterprise has distinct advantages:

  • there is a single, consistent model of enterprise data
  • there is less duplication of data across the enterprise
  • it is possible to construct a security model such that the right people see all of the data that allows them to do their jobs but not the information that is too sensitive for their job role
  • the origins of all of the business data can be traced back to source

In fact these aims are so laudable that they have been hijacked by other IT disciplines such as master data management, risk and compliance management, and business process reengineering.

DW Design (part 1)

I can’t agree with Pete more: staging, a 3NF warehouse, and then a presentation layer (marts) is, I think, a very practical way to separate concerns and avoid tight coupling between source and reports.  A la Corporate Information Factory.

For a long while I have favoured a three section data warehouse design: a staging area where raw fact and reference data is validated for referential integrity, a third-normal form layer to hold the reference data and historical fact, and finally a presentation layer to hold denormalised reference data and aggregated fact. The staging layer is ‘private’ to the data warehouse but user query access (subject to business security rules) to other layers is permitted. In some cases it will not be possible to use a denormalised layer; but if you can use one, you should.

DW Design (2) – staging data

As mentioned yesterday, the staging area of the data warehouse has three functional uses:

  • It is the initial target for data loads from source systems
  • It validates the incoming data for integrity
  • It is the data source for information to be published to the ‘user visible’ layers of the data warehouse

Optionally, it may also be where the logic to transform incoming data is applied.

Great series Pete!  Now if only this were in a book that I could tell all my blog readers and colleagues to purchase!  :)

Trying Qumana

Thursday, June 1st, 2006

I recently moved to WordPress as a blogging engine.  So far I’ve liked it very much, and the WYSIWYG editor in the web browser is pretty good.  Much better, in my opinion, than the web based Movable Type editor.

That being said, I really do like being able to blog quickly and efficiently from my desktop.  Screen captures, text, copy and paste URLs, all make blogging feel much more like a treat; a thing you get into the rhythm of instead of taking time out of the day to get things "ready to go."

Now, if I ever decide to go Linux flat out I’ll have to do this all over again I suppose.  Until then, I think I rather like it.

One thing I think is missing from nearly all of these desktop blogging products is the ability to do real copy and paste.  Everything has to already be in a file.  For quick screen captures to blog I’d LOVE for a tool to just accept clipboard data and make it into a file.  Anyone know of any product that does that?