OWB PARIS CLICK, CLICK, CLICK

For those of you using Oracle Warehouse Builder currently you are aware of the involved nature of installing the product and getting to a point you can do “real ETL.” It was multiple phases (product/design rep/runtime rep/target schema) and multiple steps in each. Once you’ve done it, and know what you’re doing one can do it in a few hours if you’ve already planned out your release. If you’re new, or aren’t quite familiar with OWB it can take days or even weeks.

I downloaded the latest beta release to install today, and I’m still amazed at how much the new version has improved this process! Download, Install Wizard(3 steps), Getting Started Wizard(3 steps), BAMMM! You’re ready to start building your data warehouse (provided of course, that you have experience in ETL and data warehousing).

Kudos to the OWB product team for simplifying this process!

ORACLE WAREHOUSE BUILDER : WORKSHOP II

UPDATE : The next workshop will be held end of May! Classes are limited to 8 seats (very hands on) Email me to get registered…

Many of you know that I’ve been developing Workshop II, the follow on workshop to OWB : Workshop I. As always, it’s an intensive hands on workshop ensuring you come away with both SKILLS and knowledge. If any blog readers are in the greater Seattle area (or have always needed an excuse for a weekend here) consider taking this course. Email me directly if you’re interested, and I can help get you registered for the workshop
ORACLE WAREHOUSE BUILDER : WORKSHOP II

This course builds upon the basic skills in Workshop I, and students leave with enough knowledge to begin building their warehouse. Users progress beyond basic mappings and learn additional DML techniques. Additional ETL skills are developed and students learn how to use Set Operators, Custom PL/SQL, and leave with most of the logical operators to build their warehouse. OWB Execution is explored and students learn how to build Process Flows to sequence and monitor the execution of their ETL Mappings. Students learn how to deliver web interfaces to display their Runtime (DBAs) and Design (End Users) data. Students learn techniques for scheduling and executing OWB from SQLPlus (EM).

Course Objectives
After completing the course, students should:

  • Be able to build process flows, ensure that errors are properly detected and managed
  • Be able to deliver the Web Browser based applications (Runtime Audit Browser for DBAs, and Design Browser for End Users)
  • Understand and be able to Import and Export data from OWB (OWB Metdata and Discoverer Integration)
  • Understand the applicability of, and know how to use the following ETL operators (Splitter, SetOp, View, Transformations, Expressions)
  • Be able to perform common DML processing (Trunc/Ins, Update, Delete, Merge) in mappings
  • Understand how to, and be able to deploy custom procedures and
    functions
  • Know the composition and the applicability of the OWB provided
    transformations
  • Be able to create views, and materialized views within OWB
  • Know how to execute and monitor OWB mappings from SQL Plus/Enterprise Manager

Text and Materials
All materials are supplied to students at the beginning of each class.

Class Format
Classes consist of approximately 1.5 hours of lecture supported by slides and whiteboarding. The balance of the day, approximately 6.5 hours, is hands-on. Each students works on their own Linux server.

Prerequisites
To be successful students must be comfortable with basic Oracle skills. The following is a short list of the suggested but not required skills:

  • Be able to accomplish all the objectives in Workshop I. Attendance at Workshop I is not required, but suggested.
  • Understand what schemas are, how they relate to users
  • Understand basic DDL (create table, alter table, create sequence, create or replace package) and DML (select, insert, update)
  • Know how to connect to Oracle using both Localnames (tnsnames.ora) and JDBC (host/port/SID)
  • Students will benefit most when they have a basic understanding of dimensional modeling (Cubes/Facts/StarSchemas)

Nice URLs for HTMLDB? Fishing for help!

I think HTMLDB is quite a gem in the Oracle Database offering. It is a great data centric GUI building tool; if you need a quick GUI for an Oracle based application consider HTMLDB. I got on to HTMLDB when I used it to build an administrative tool for an OWB Data Warehouse for a large publisher in Boston. I did so because it was free and easy, but quickly became addicted!

Asktom runs on HTMLDB; obviously it scales! So, assuming you don’t mind vendor lock in (will only run on Oracle) for your database, HTMLDB can be used as a GUI for custom applications, intranet applications, and extranet applications. Rapid, productive, and FREE with the database! Excellent! It’s powerful enough to even build publicly accessible websites, like AskTom…

That’s where I’m getting into a pickle! Publicly accessible sites have some additional requirements. Scalability, check. Professional looking templates, check. URLs acceptable for bookmarking and search engine ranking, big screeching crashing halt!

For those unfamiliar with search engine indexing in general, here’s the summary:
Search Engines generally rank lower pages with identical content, if their content is determined by parameters to the right of the “?” In other words (and entirely without proof) the following AskTom page on “Fast Wild Card Searching” (http://asktom.oracle.com)

/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:37336026927381
would be ranked higher for some google searches if it’s URL was more like this:
/pls/ask/4950/8/fast_wc_srchs.html?p=::sess:NO

There’s an entire industry built around how to ensure that your valuable content is mated with people searching for it. It would be tough to sell a company on using a solution that will almost certainly decrease their ability to reach customers (google users) with the content they put into their application.

I’ve used mod_rewrite before on simple point solutions to help make a site google friendly. However there is some significant work to be done(maps of page IDs to page names, maps of application IDs to names, complicated regex, etc) to create a mod_rewrite solution to make HTML DB urls easier on the eyes, bookmarks, and better indexed by search engines.

Anyone facing similar issues? Anyone have any prior art they wouldn’t mind sharing with the community? Please email me (boy, I really need to get my comments working on my blog) and I’ll be happy to post any information on it here.

Gibberish Talks at Conferences

I thought this was notable mention, originally posted on slashdot. Have you ever attended a presentation at a conference that made absolutely no sense? In this case, some computer scientists were able to prove that one can get ENTIRELY GENERATED AND BOGUS PAPERS accepted for conference presentations.

Rooter: A Methodology for the Typical Unification of Access Points and Redundancy
Many physicists would agree that, had it not been for the congestion control, the evaluation of web browsers might never have occurred. In fact, few hackers worldwide would disagree with the essential unification of voice-over-IP and public-private key pair. In order to solve this riddle, we confirm that SMPs can be made stochastic, cacheable, and interposable.

BI OR DW CONSULTANT?

I’ve noticed some puzzled looks to the Title I’ve taken in my company (bayon) of “Principal Business Intelligence Consultant.” This usually comes when they realize that I’m what is more commonly called a Data Warehouse Consultant, Data Warehouse Architect, or Data Warehouse Developer. People tend to think of Business Intelligence Consultants as those who define the accounting rules and metrics that will appear in report X, or those that help develop and track the strategic intelligence metrics required by an organization. They don’t typically think these are the people that are installing and configuring the database software, or building the physical applications that will breathe life into those metrics and business rules.

I think this is intriguing, since traditionally Titles are supposed to say what you do. Our industry might be a bit carried away with a focus on tools (any big surprises here?) instead of capabilities and skills. If we base our most fundamental labeling (a Title is the individual equivalent of the company elevator speech) on what software we use, instead of what we do does that say something profound about our industry?

The CHEF does not describe themself as a Cuisinart/KitchenAid Practitioner, but rather as a “Modern Chef.’
A PARALEGAL uses the term for their card, not the fact they are a “ParaLegal Software Version 1.1 Operator.”
CARPENTERs are not called “Hammering and Sawing Consultants.”

Don’t get me wrong, the tools and methodologies you use are important. My bio says that I use Oracle(and am certified), which helps people know if I can “do” what I “do” in their IT ecosystem (they use Oracle). In fact, in some environments it might be the paramount concern:

“Call yourself whatever you like but if you know how to write PL/SQL that builds dynamic SQL using execute immediate then you are our PERFECT fit.”

It’s not just important, it’s very important. However, it’s just not what a Title should be.

Bringing actionable intelligence to people is what I do. I deliver salient intelligence to business stakeholders so they can make decisions faster, and with greater likelihood of beneficial results. That is what I do. The method I use to accomplish this is almost always a Data Warehouse built using a variety of tools (who am I kidding, it’s almost always Oracle). That is how I do it; an architecture and tools.

What I do (Business Intelligence), has value. How I do it (Data Warehousing), is a cost center.

Anyone think I’m missing the mark? Please send me thoughts on the subject. I really need to get comments going on the site.

btw, I’m considering changing my Title. 🙂 Customer is king, and I’m finding most think in the “how/tools” view of the world.

OWB SOURCES AND TARGETS SQL

There was a posting on the OWB OTN forum about how to build a report documenting mappings, their sources, and their targets. Patrick Goessens provided an OMBPlus script that fits the bill and should work (perhaps slightly customized) brilliantly. I love OMBPlus and find it to be a very useful addition to the Oracle Warehouse Builder product. I’ve noticed Patricks postings before, and he is in the minority of OWB developers who have command of a very powerful feature.

There might be circumstances when OMBPlus might not be a prefereable option. Not everyone will have OWB installed, or want to pay the additional license costs to just “report” on the metadata. The consumer of this information might not be a “person,” but rather another system or repository. In that case, coordinating the execution of an OMBPlus script, parsing and importing into an alternative application might be troublesome.

I’ve built a small SQL script that runs against the OWB design repository public views (an Oracle provided view into the design metadata). I ran this against the solution for the sample company for my OWB workshop and it reports correctly for it. Use it as a reference, but ensure it works for your actual metadata repository as I don’t claim that this is a complete solution (or even nicely written SQL).
sqplus design_rep/design_rep_password@DB

select
  distinct 'TARGET',
  comp.map_name,
  comp.data_entity_name,
  comp.operator_type
from
  all_iv_xform_map_components comp,
  all_iv_xform_map_parameters param
where
  lower(operator_type)
     in ('table', 'view', 'dimension', 'cube')
     and param.map_component_id = comp.map_component_id
     and param.source_parameter_id is not null
UNION
select
  distinct 'SOURCE',
  t1.c1,
  t1.c2,
  t1.c3
from
  (select
  comp.map_name c1,
  comp.data_entity_name c2,
  comp.operator_type c3,
  max(param.source_parameter_id) c4
from
  all_iv_xform_map_components comp,
  all_iv_xform_map_parameters param
where
  lower(operator_type) in
                           ('table', 'view', 'dimension', 'cube')
     and param.map_component_id = comp.map_component_id
 group by
comp.map_name, comp.data_entity_name, comp.operator_type) t1
where t1.c4 is null
order by 2,1

TARGETS are determined if any of their columns have a source parameter (ie, there’s been a line drawn into them on a mapping). SOURCES are determined if they do not have any source parameters (they don’t have any mapping lines coming “in”). I couldn’t see any special flag or marker to find the “one” target of a mapping, so it had to be inferred. I wonder if anyone from the OWB product team would like to comment on the validity of this logic?

Let me know how people get on with this script; especially if there’s any edge cases or revisions that augment it.

Open Source OLAP

Every month I review the web traffic reports for my blog, and I’ve always found something rather interesting. Even though I post more information about Oracle and OWB than any other subject, Google seems to send me more traffic from queries like “open source ETL” and “open source OLAP.” You know what they say, customer is king and you gotta give ’em what they want!

In other words, all I needed was just a teentsee weentsee bit of an excuse to take some time to really kick the tires of the open source OLAP server Mondrian.

Some basics… Mondrian is an open source OLAP server, written in Java. It implements an MDX engine, and also exposes an XML/A interface to clients. Mondrian uses a ROLAP architecture, and ends up issuing SQL statements to a JDBC data source to retrieve and calculate. Mondrian works with Access, MySQL, postgres, and Oracle. Refer to the Mondrian architecture pages to get some more information about the architecture.

My overall impressions were positive; it’s a good core set of functionality and performs rather well. Like any Open Source project it is an alphabet soup of supporting libraries, environment variables, generators, frameworks, and takes more than the usual 10 minute commercial product install. We’re not building a kernel here, but it’s not trivial to get the examples up and running.

Mondrian works closely (and appropriately) with an open source implementation of a JSP based Pivot and Charting project, JPivot. The demo for Mondrian includes an example with JPivot querying Mondrian and uses the well known Food Mart demo. I was expecting a bit less from JPivot and was pleasantly surprised that it’s actually rather functional (not commercial product easy to use, but really quite commendable).

JPivot allows for drilling down hierarchies (I don’t think you can use multiple hierarchies) and Pivoting and exhanging the columns and rows. It has a “CUBE” editor that allows you to edit the report. It’s not drag and drop, but definitely works if you “grok” the interface.

Also pleasantly surprising was some pretty decent charting capabilities.

There are some decent selections of charts

In order to provide an OLAP view of your data you have to define some metadata about your Dimensional model (Cubes, Measures, Dimensions) and how they map to your underlying Relational Schema. Check out the samples on the Mondrian site to see how to write your own schema.

Couple of interesting things to point out, Mondrian implements a cache of “relations” used to increase performance. This is interesting because of consistency questions (some fragments are cached, but others are current) but also because it is WICKED fast once it’s loaded into memory. There are some interesting possibilities here, including some work with some distributed P2P OLAP distributed caching research.

DISCOVERER MEETS DUKE NUKEM

Is there any better way to spend a Monday than to build a 3D Dashboard (using some sample reports from my upcoming OWB course)?

You can obviously see I’m not describing 3D report effects in the context of a web based reporting application, but rather a 3D environment that has the actual dashboard content displayed in 3D space. I had been giving a lot of thought recently to the constant craving of executives for BI dashboards. In fact I think some organizations might consider implementing such a solution because of the following two realities:

  1. Executives love web based dashboards, but what they really want is to have the NASDAQ MarketSite at their office. Screen upon screen of everything they need to know to make great decisions. Building out hundreds of LCD panels might not fly with shareholders though, and I don’t blame them as they are very expensive. With this sort of interface, one can still get the “control room” metaphor without the expense.
  2. The generation of “video game playing and we’ve been using computers since 2nd grade” are now being tapped into the highest C-level positions. It’s not uncommon for an executive to know how to play a little DOOM (even if it was more than a few years back). In other words, there are a significant number of executives that would not feel “uncomfortable” in front of a multiplayer game interface.

However, there are SIGNIFICANT limitations to a 3D space, not the least of which is the ability to handle 2D information quickly and effectively (a la web pages/actual applications). When I began the exercise I was certain that there would be no use in the project unless it could integrate back with EXISTING systems. There is no way one would be able to work effectively entirely in a 3D work, so it would have to play nicely with the current applications. I’ve done a very small integration with Oracle Discoverer where by clicking on one of the dashboard 3D objects you see Discoverer Viewer drilled to the sheet in question alongside the 3D world. Without something like this, where people could get to the “real work”, I think a 3D dashboard is pretty much a nonstarter.

One could take the metaphor further, and have virtual meetings in the control room. Here I’ve pictured the dashboard from the “3rd” person view instead of the “1st” person view. Consider a situation where the District Sales Manager calls the Senior VP of Sales and asks to “show him” some area of concern. He could literally take the SVP/Sales to the dashboard area and they could sift through the data together.

I sort of haphazardly found this 3D software, as I have a customer that used it (ActiveWorlds) as a prototype for a 3D community so their web site visitors could explore articles on their site, chat with each other, play 3D games, etc. While I was at that customer site I helped them integrate some “external media” into the 3D product, and found it rather straightforward and easy. I don’t mean to offend but I found the communities who play 3D games a bit strange and having little to no interest in the external application of their technologies. I found a myriad of resources on the legend and lore of ficticious 3D worlds (and rivalries between them) but few resources on how to “drill through with URL parameters.”

I’ve attached some screenshots in this article, and there is also a 12MB AVI file that demos the solution. I built it on a corporate system so I can’t offer up access to the outside world at this time. However, I greatly value the community traffic and opinions so if there’s enough interest perhaps I’ll see about “putting it up.” Email me if you’d like to actually play with it and I’ll see what I can do.
UPDATE: There were some issues with the CODEC for the AVI. If you had problems previously, feel free to try again.