2004 : DBMS_RECTIFIER, Instead of Triggers, ORA_HASH

As a recent transplant to Seattle, I’m keen on getting connected with my local community of Linux, Open Source, BI, and Oracle brethren. I took in my first Puget Sound Oracle Users Group meeting last night. The group draws between 10-50 people per week, appears well organized, and based on my observations from my first meeting, provides substantial substance in a peer-to-peer setting.

Daniel Morgan, from the UW extension school, presented three mini-lectures on DBMS_RECTIFIER, Instead of Triggers, and ORA_HASH. I highly recommend you visit these pages on PSOUG, they have examples that demonstrate the following features quite nicely.

DBMS_RECTIFIER is a rather interesting package, especially for someone who spends a great deal of my Oracle time building CDC routines for Data Warehousing. At a basic level, it’s two procedures that provide a quick method to easily identify differences in data tables, and correct those changes. The rectify has an interesting behavior, and it’s not particularly well suited to CDC. However, the differences procedure quickly identifies differences (can be scoped to only a few columns as well) and places the different ROWIDs in a seperate table. In a pure Oracle environment, a BI professional could make use of this since ROWID provides a very quick way to access rows.

Instead of Triggers apparently have been availabe since Oracle 8. Some views are updateable, if they match a certain criteria. However, those that do not match this criteria, there is still a way to allow for updates/deletes/inserts on these views. Oracle allows you to make a special trigger, an Instead Of trigger that only applies to Views. They allow someone to specify exactly the DML to occur, and can include procedure and function calls, and DML on completely different tables. This could be used to :

Audit all inserts/updates through a defined view to a seperate audit table. I know this is old news, but this is a pretty slick way to accomplish this.

Provide an interface where certain DML statements become impossible. One can provide a view to a table that limits the DML so that certain columns are never updated.

Interface with an external system. One could create views in Oracle that provide an interface to a remote system. Data inserted/updated/deleted from these views could send these inserts/updates/delete to an external system via anything you can write in PL/SQL.

ORA_HASH. Quick. Flexible. Hash Function. Useful for tons of things. If you know how to use HASHing, you’ll find this convenient. It too, has been in Oracle for ages. 🙂

OWB Runtime : How to purge audit data easily

The OWB runtime generates runtime audit data for every process, mapping, and file loading that it invokes. If you have intricate flows and load your warehouse/mart several times a day (every 5 minutes perhaps?) you’ll see these pile up quickly. The easiest user method for purging this audit data is through the OWB Runtime Browser that allows you to select (checkbox).

Sometimes you don’t want to manually do this, but rather have a way to do this programatically. Perhaps because you want to keep a clean house (always purge greater than 30 days old). Perhaps you generate a lot of audit data in a development environment that’s a hassle to delete from OWB RT Browser. Perhaps you rarely visit the OWB RT and don’t want to fuss with keeping it up.

Whatever your reason, please feel free to use the following little script to systematically purge old audit data. It must be run as your OWB Runtime Owner. I’ve not run this by Oracle, or seen them document this API so use it at your own risk. It works, though.

begin
for purge_ids in (select distinct top_level_execution_audit_id id
from ALL_RT_AUDIT_EXECUTIONS
where updated_on

OWB10g Paris : Installation and Security

The OWB team has clearly heard the masses and responded to their difficulty in installing and getting “up and running” with OWB. I have to think back to those first few days of head scratching over the installation and “quick” start guides to remember how tough it was to just get to a point where you’re building tables and composing mappings. Thinking back it seems painless but as I’ve observed at customers and conferences alike there are great difficulties and a learning curve to installing OWB. Smart, intelligent DBAs and BI architects had to spend some amount of time to understand the concepts to get going.

Previously there were numerous steps (I believe the OWB team mentioned a total of 60 steps, but those included “double click on this” and the like). They’ve been simplified by the consolidation of the two repositories. OWB veterans are aware there were previously two primary repositories that one interacts with to get things done. There was the DESIGN repository which held the metadata, ETL definitions, table structures, etc Then there was the RUNTIME repository which invoked and audited the acutal flows, mappings, and deployments being executed. Beyond that, every schema that was going to execute any mappings, or have any data objects also required the installation of a TARGET runtime components so the runtime repository could execute and monitor ETL in that schema. In a stage/warehouse/multiple mart setting, this sometimes meant stepping through the Repository Assistant many times (24 times at one of my customers for a set of development/stage/production environments).

OWB10g Paris consolidates the two primary repositories (DESIGN and RUNTIME) into one unified repository. This greatly eases the burden of installation, which can be done from a simple “New User” wizard on the client. During my testing of the product, we were working with a single database so I was not able to test a multiple database installation. The OWB product team indicates that the original topology is still available for customers wishing to use only portions of certain repositories (ie, use one design, with development/testing/production runtimes).

There is also an augmented OWB user security, permissions, and role based system that supports another great feature of OWB installation. Target schemas (which often make up the bulk of installation time in new environments) allow for new target schemas to be added as a simple afterthought with on the last screen of an add User wizard. OWB objects now have permissions and can be assigned throttled permissions (full ctl, edit, read) by both user and role.

It was noted by OWB product team that the security provided in OWB is enforced for access to the metadata, not the actual deployed objects. If one restricts the “Customer” table in OWB to only those with “INTERNAL_EMPLOYEES” this means that only those users in INTERNAL_EMPLOYEES are allowed to utilized, examine, and otherwise edit the CUSTOMER OWB table. When that table is generated and deployed, it is then subject to the normal Oracle permissions and OWB does not have a method for managing that.

My fellow testers echoed the sentiment that if OWB is really positioning itself to be an inclusive, central location for BI work in an enterprise it will also need to figure out how to capture security and access rules. OWB team product team acknowledged need… We’ll see if it ends up in a major revision or two out.

This blog is part of the OWB Paris Early Review series which reviews and comments on several new Paris features.

owa_pattern.amatch gotcha

I’ve been using regular expressions for years. A few handy cut, sorts, egreps, and uniqs work wonders on datafiles for quick easy troubleshooting. I spent an hour or so tracking down what I think is a funny implementation of regular expression matching. Thought I would quickly post it, and hope google picks it up so that perhaps I can save some poor soul the couple hours I just lost to it.

There is limited information about owa_pattern available; one of the better resources I’ve found is this page on stormloader.com. One of the bits of advice on that page is to:

Or use the second function in owa_pattern, amatch. This function has an IN parameter that dictates where you want to start to match. For our purpose, we don’t care so set it to 0. What we want from it is the type of the return value, integer.

Examine the next two commands and their results, and perhaps you can glean what the issue is with setting the offset to 0
select DATE_UTC, owa_pattern.amatch(date_utc, 0, ‘2004’) from xxxx yields

[12/Jul/2004:00:07:27	0
[12/Jul/2004:00:07:37	0

select DATE_UTC, owa_pattern.amatch(date_utc, 9, ‘2004’) from xxxx yields

[12/Jul/2004:00:07:27	13
[12/Jul/2004:00:07:37	13

The offset, or where to start, doesn’t necessarily mean that that is where the pattern matching will begin to scan for the pattern. It most literally means that the pattern must start at that offset. I’m not a huge expert on regex, but I wonder why a function that is supposed to tell YOU where the regular expression exists in the string requires that it start on an offset specified. Anyone have a perspective or a different experience (perhaps I’m looking at this improperly)?

We've come a long way

I’ve had some moments over the past few days that have all served to solidify that the world has come a very long way in the last 10 years. I recall 1994 where I was about the only person I ran with socially that had an email address, and the idea that I had a website (hosted by a research project for Sun) with my resume was very fringe and nerdy.

How all that has changed… As I write this, I am sitting at a corner seat at Tully’s coffee shop in Seattle, WA with only a power cord connected to my laptop(WiFi is brilliant). I am connected remotely and securely to a customer location working on building a web traffic and customer analysis Data Mart. I have just ordered a 6.0/768 Mbps line for my new home office that will be provisioned in one week. I have ordered Voice service over that dedicated data line for just $30/mo more. The progress is astounding… I am listening to a radio station from the UK clearer than local FM. Put it all together and I, along with others, expect the next few decades of human to truly transform society and life on this planet. These are great times we live in…

OWB10g Paris : UI Improvements

Of course software vendor product teams products shots always show their UI configured in the most efficient way possible leading you to below that the development tool their pitching will be the greatest thing for your efficiency since Venti quadruple non-fat lattes. Many times, the actual product usually falls short and requires a significant curve to “grock” how the product designers thought that YOU would be going about your job. The OWB product team has experience with doing a significant rewrite of their GUI (9.0.3 to 9.2 was significantly different). If those OWB verterans thought that was significant, prepare to be shocked.

The OWB10gR2 GUI will delight those that have not used OWB before as it has embraced some UI paradigms of other IDEs. Those familiar with JDeveloper, JBuilder, NetBeans, etc. will have a much easier time learning OWB. The OWB product team has accomplished this by making two significant UI changes:

Smaller Fonts are pervasive through the product. : I dread having to return the OWB 10gR1 with it’s monstrous fonts that make displaying even the most simple of mappings difficult. This smaller fonts on dialogs, wizards, editors, and navigation are a great enhancement to the product. In the short list of features I would have submitted for the 10gR2 version of the product this would not have been on it. I am very thankful that I was not the one composing the shortlist; the smaller fonts make the entire product much more usable on the whole.
Dockable Panels throughout : The editors and OWB navigator have created a much improved display to support the real work that occurs in OWB. They have created a set of screens that allow the focus to be on the primary work at hand (process flow, mapping, table definition, etc) but allow for the peripheral, oft utilized parts of the definition available in a fast refreshing set of panels surrounding the main event. One panel display the properties for the objects highlighted thus eliminating MANY MANY redudant clicks and scrolls, and highlights, and drills to the attribute. Another panel displays a palette of objects that can be added to the process or mapping. The panels are dockable (make them float), and collapsable (running out of space quickly expand and collapse them).

While most of the benefit of their UI efforts are captured in the two major changes there are other beneficial UI changes. Overall the OWB feels like it is quicker, and more attuned to the work at hand. Birds Eye Views of your edited object, automatic layout buttons all add a certain refinement and really lead me to believe that OWB is really maturing into a robust comprehensive product. Of course, it must be for us BI consultants, ETL developers, architects, DBAs to spend entire days with our noses plugged to the screen.

This blog is part of the OWB Paris Early Review series which reviews and comments on several new Paris features.

Open source ETL

I was visiting with someone the other day who mentioned they might eventually be moving to an area that, because of a variety of factors, is predominantly inexpensive and free software solutions. This person is very knowledgable about databases, ETL, etc. and would probably like to continue their work in this regard. I’ve seen some mention of this before. While poking around the other day, I ran across an open source package for ETL. While it’s simple, and somewhat limited it could serve as a starting point or a very inexpensive alternative for a small MySQL based mart.

Six Apart gets nod from Investors

The makers of Typepad and Moveable Type have long been regarded as the professionals blogger tool. There are copious amount of good, free, open source blogging tools available which is fine for a great many purposes. However, the quality and support from a nominal license fee ensures that those wishing to do a professional looking and operating blog have a source. Ran across someone who thinks highly of the company, as well as the technology on a blog on ventureblog.com.

Of course, there’s more to Six Apart than powerful software. As elegant as I think both Movable Type and TypePad are, to quote myself again, “it’s the people, stupid!” The Six Apart team is phenomenal and getting better every week. I don’t invest in products or ideas or business models.

Note: The blogger has invested in the company… 🙂

VPN over SSL is only acceptable

Companies have different ways of implementing and securing their corporate networks. The ways to secure a corporate network are too numerous to mention here, let alone add any substance to. There are a variety of vendors providing copious amounts of networking and security hardware.

I have a customer that is using an SSL based VPN solution, to allow access for it’s employees and partners from offsite locations. This is one of those technology solutions that looks fabulous on the proposal from the vendor to an infrastructure group, but leaves a little to be desired when it comes to using it on a day to day basis.

Take for instance, a Juniper Networks product formerly called NeoTeris. The premise is simple, install the device in your “Public Zone” and give it access to your corporate network. Configure it to authenticate against a Windows Domain server. Employees hit a website, secure.mycompany.com, log in with their username and password, and then it’s just like being on the corporate network without any special software. Well, not exactly.

There some limitations to this approach… Mostly related to performance, but some as it relates to functionality. The implementation of this system involves the download of an applet that routes traffic via SSL to and from the remote server. When access remote applications (SQLNet, JDBC, VNC, ssh, Windows Share, etc) the most notably drawback is latency. The network packets are being processed by an embedded java applet before being routed over SSL, which for those who are familiar with those two technologies you’ll understand why they aren’t well suited to quick, instantaneous packet forward. Throughput is minimized as it appears (I don’t know if this is, in fact, true) as if the applet is chunking data into packets according to a rigid configuration, rather than using the great TCP/IP feature of ratcheting.

The features are somewhat limited in that because a VPN system has just placed you on the remote network, you are required to configure applications port by port. For most applications this is acceptable, as you connect through them to a certain port, and life is good. There are some applications that change up their ports, and come configured by default to “move to another port” or are built to do so (you can’t turn it off). Because one must configure the ports of the system individually, a user of the SSL based system has no chance of using these applications because they couldn’t possibly guess the port ahead of time.

Overall it’s an application that provides remote access for a great number of people, with a great provisioning model, works for most applications, and is alright for occasional home based use. My experience stretching one customers installations to the limits to do my work for them, indicate that it’s not very suitable for remote workers. Remote workers that expect their remote applications and access to a corporate network be a similar experience to their commuting peers, will be disapointed with VPN over SSL. Companies should know that if their needs are limited, this is a good solution. If you have remote workers who might expect a good remote experience, they should stay away from VPN over SSL.

6 weeks and some new digs

For those paying attention (there are a few of you out there), you’ll notice that my blog (formerly nicholas goodman, consultant extraordinaire) has been folded into a new bayon blog.

bayon, my small boutique BI consulting firm, may invite other people to contribute to the world of Oracle, BI, Grid Computing, etc.

You may also have noticed that there has been a significant gap in time since my last post. The upside of my recent relocation from Boston, MA to Seattle, WA is that I’m in a beautiful new city with so many outdoor activities to chose from. The downside is, that for the past 6 weeks I’ve been surviving my projects, moving, etc.

Hope to get the observations flowing soon enough.

Goodman on BI

Thoughts on Open Source, Analytics