Database Operation Complexity Reference

I mentioned this previously, but I’ve been reading “Principles of Distributed Database Systems.” I’m enjoying it, and it’s helping me solidify many of the concepts that I apply daily in my capacity as a Principal BI Solutions consultant. Database theory, and specifically as it relates to tuning is part of any professionals work with Oracle. We’ve all deciphered the performance implications and clues for improvements from EXPLAIN PLAN. I’ve always been told you want to be able to give Oracle the clues/configurations to enable filter of results (selection) before joining. It always made sense but I had never understood fully the concepts behind these recommendations. Until now…

I don’t espouse to understand all of the complexities behind these issues but I ran across a great reference chart. I wanted to post it here along with some numbers for demonstration as a quick reference for any professionals with understanding of quadratic, logarithmic, and linear scales to match those up with the operations we use on a day to day basis.

This is from the book mentioned above, however I’m sure it’s rather common knowledge.

OPERATION COMPLEXITY
SELECT O(n)
PROJECT (without dedup)
PROJECT (with dedup) O(n*log n)
GROUP
JOIN
SEMIJOIN
DIVISION
SET OPERATORS
CARTESIAN PRODUCT O(n²)

A quick look at some numbers in the orders mentioned yield the following “costs” in n operation times.

n O(n) O(n*log n) O(n²)
5 5 3.49 25
10 10 10 100
100 100 200 10000
1000 1000 3000 1000000
1000000 1000000 6000000 1E+12

Hope this helps provide a useful way to match up the database operations we use on a daily basis with the theoretical cost of such operations. Cheers!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>