DynamoDB: Time Dimension table with MERGE

So, even with my disclaimer note on the last blog, the DynamoDB developers slapped me around a bit for suggesting using a view for a Time Dimension. The Time Dimension is the most important dimension table and should be an actual table, not a view. Creating the table allows us to perform all kinds of optimizations like star joins, bitmap indexes on attributes, etc. Probably wouldn’t be that big of a deal for a tiny fact table (< 5million records) but you’ll want CREATE TABLE if you want good performance.

Good news is, that we can use our exact same table function (with fiscal year offset)

select * from table(applib.fiscal_time_dimension (2000, 1, 1, 2009, 12, 31, 3))

to populate and keep our Time Dimension TABLE up to date.

If you use a TABLE, it’s 2 steps:

CREATE TABLE : “dim_time”
POPULATE TABLE : “merge into dim_time”

We’ll be using another great tool in the DynamoDB / LucidDB toolkit, the MERGE statement. The MERGE statement is a logical UPSERT. It checks to see if key is already present. If it is, we UPDATE the table. If it isn’t, we INSERT it into the table. I’ll go into more detail at some point in the future as MERGE is crucial for keeping dimensions up to date.

Let’s create our Time Dimension table:

create table dim_time (
FISCAL_YEAR_END_DATE DATE
, FISCAL_YEAR_START_DATE DATE
... ABBREVIATED ...
, TIME_KEY DATE
, TIME_KEY_SEQ INTEGER
,constraint dim_time_pk primary key (day_from_julian));

NOTE: We’ve abbreviated the statements, but all the columns are used in the actual scripts. We also should add bitmap indexes on YR, MONTH, etc columns.

We’ve now got a TABLE that matches the VIEW we created in the previous blog. We’ve made day_from_julian as our PK, and we’ll use this date as our key for the MERGE statement. We can run this query as many times as we like and it will always just keep our “dim_time” table up to date.

merge into dim_time using (select * from
      table(applib.fiscal_time_dimension (2000, 1, 1, 2010, 12, 31, 3))) src
on dim_time.day_from_julian = src.day_from_julian
when matched then UPDATE set
FISCAL_YEAR_END_DATE=src.FISCAL_YEAR_END_DATE
,FISCAL_YEAR_START_DATE=src.FISCAL_YEAR_START_DATE
... ABBREVIATED ...
,TIME_KEY=src.TIME_KEY
,TIME_KEY_SEQ=src.TIME_KEY_SEQ
when not matched then INSERT
(FISCAL_YEAR_END_DATE
 , FISCAL_YEAR_START_DATE
... ABBREVIATED ...
 , TIME_KEY
 , TIME_KEY_SEQ)
values(
src.FISCAL_YEAR_END_DATE
 , src.FISCAL_YEAR_START_DATE
... ABBREVIATED ...
 , src.TIME_KEY
 , src.TIME_KEY_SEQ);

One thought on “DynamoDB: Time Dimension table with MERGE”

jcamacho April 7, 2010 at 12:59 am

Nice blog Nick,

I followed your post regarding the “automagic” time dimension view in DynamoBI/LucidDB. It´s a great feature of the product. Do you know if it supports localization? I would like to generate this view using the Spanish formats for dates. In my tests I didn´t achieve this, so I think the JAVA code behind the routine doesn´t take the system environment localization into account. So, do you know any workaround for this? Perhaps modifying the routine code, or executing another code before the time_dimension app?

Thank you in advance. Bookmarked your blog 🙂

Reply ↓

Goodman on BI

Thoughts on Open Source, Analytics

DynamoDB: Time Dimension table with MERGE

One thought on “DynamoDB: Time Dimension table with MERGE”

Leave a Reply Cancel reply