Monthly Archives: June 2007

Well, I noticed

My workday goes much smoother because I listen to a variety of Online Radio stations. Today they all went silent or played public awareness campaigns from SaveNetRadio.org. People have been wondering if anyone will notice. Well, like the title says: I noticed!

I don’t know all the mechanics, but it comes to this. A lot of these small, boutique-ish online radio stations will shut down because the cost structure of the compensation will be, in their opinion, unsuitable.

I, for one, being a proponent of open content, software, and standards think there must be some underlying disconnect between the “Copyright Royalty Board” and broadcasters.

These small, hobbyist online broadcasters are part of larger shift in broadcasting/economies. Web 2.0-ish user generated content and participatory systems of consumer and producer.

The CRB probably needs to take another look at what it’s doing to see if it’s just trying to hang on to old ideologies in a new world.

I support Net Radio. 🙂 You should too.
SaveNetRadio.org

Pet Peeve: EST != EDT

I work with people all over the country and the world. What that means is that we often schedule meetings, calls, webex meetings, remote consulting sessions, etc. Lacking some great shared calendar in the cloud that we can use to do this adhoc (I’m sure there’s some web 2.0 startup who does this so please comment if you know of something GOOD) this means that people email and put suggested and adjusted times in emails.

For instance, just yesterday, I received the following email:

The regular 10am EST XYZ meeting tomorrow is cancelled until further notice.

What’s the issue with this email? Well, we don’t have a 10am EST meeting. We have a meeting scheduled at 10am Eastern (ie, when the clock in the eastern time zone hits 10am during the summer months).

EDT and EST have VERY SPECIFIC timezone offsets.

EDT = UTC – 4
EST = UTC – 5

I use generally, and think many others also use “Eastern” to refer ONLY to local time. Ie, what the clock on the wall says in New York regardless of EDT/EST.

Let’s take the above example:

  • 10:00 EST on June 22 (someone sends an email requesting a meeting)
  • 10:00 EST = 13:00 UTC (given the definition of EST, with an offset of -5 hours)
  • 13:00 UTC = 11:00 EDT (ie, makes sense right, 10:00 EST = 11:00 EDT)
  • 11:00 Eastern = 10:00 EST (on June 22 when New York is in EDT the actual meeting time)

Obviously you assume that someone requesting a meeting for 10am EST on day that falls on EDT was ACTUALLY requesting a meeting at 10am EDT. However, why bother doing that?

My suggestion to people that can’t keep it all straight:

Use Eastern/Pacific instead of EDT/PSTs. Eastern/Pacific is clear that it’s local time but you haven’t confused it by requesting an incorrect time.

Kettles secret in-memory database

Kettles secret in-memory database is

  1. Not actually secret
  2. Not actually Kettles

There. I said it, and I feel much better. 🙂

In most circumstances, Kettle is used in conjunction with a database. You are typically doing something with a database: INSERTs, UPDATEs, DELETEs, UPSERTs, DIMENSION UPDATEs, etc. While I do know of some people that are using Kettle without a database (think log munching and summarization) a database is something that a Kettle developer almost always has at their disposal.

Sometimes there isn’t a database. Sometimes you don’t want the slowdown of persistence in a database. Sometimes you just want Kettle to just have an in memory blackboard across transformations. Sometimes you want to ship an example to a customer using database operations but don’t want to fuss with database install, dump files, etc.

Kettle ships with a Hypersonic driver, and therefore, has the ability to create an in memory database that does (most) everything you need database wise.

For instance, I’ve created two sample transformations that use this in-memory database.

The first one, kettle_inprocess_database.ktr, loads data into a simple table:
200706202230

The second one, kettle_inprocess_database_read.ktr, reads the data back from that simple table:
200706202235

To setup the database used in both of these transformations, which has no files, and is only valid for the length of the JVM I’ve used the following Kettle database connection setup.

I setup a connection named example_db using the Generic option. This is so that I have full control over the JDBC URL.
200706202227

I then head to the Generic tab and input by URL and Driver. Nothing special with the driver class, org.hsqldb.jdbcDriver that is just the regular HSQLDB driver name. The URL is a little different then usual. The URL provided tells hypersonic to use a database in-memory with no persistence, and no data fil.e”
200706202225

Ok, that means the database “example_db” should be setup for the transformations.

Remember, there is NOTHING persistent about this database. That means, every time I start Kettle the database will have no tables, no rows, no nothing. Some steps to run through this example.

  1. Open kettle_inprocess_database. “Test” the example_db connection to ensure that I / you have setup the in-memory database correctly.
  2. Remember, nothing in the database so we have to create our table. In the testing table operator, hit the SQL Button at the bottom of the editor to generate the DDL for this smple table.
  3. Run kettle_inprocess_database and verify that it loaded 10 rows into testingtable.
  4. Run kettle_inprocess_database_read and verify that it is reading 10 rows from the in-memory table testingtable.

I should note that using this approach isn’t always a good idea. In particular there’s issues with memory management, thread safety, it definitely won’t work with Kettles clustering features. However, it’s a simple easy solution for some circumstances. Your mileage may vary but ENJOY!