John Sequeira

Amped::Technology

Thursday, December 08, 2005
Windows Live Local

is a dumb name.

But the bird's eye view of my office is somewhat redeeming.
2:37:57 PM

Open Source OLAP Options

Three times in my life I've been able to take a client query that ran for more than 10 minutes and bring it down to under 10 seconds. In one case, some long-gone contractor had decided to implement an OO system in ColdFusion. When CF started up, his code precached the objects one by one, executing several Load methods (each of which kicked off several queries) for each of the 10K objects that the server needed in memory. When I looked at the code, it was apparent that he wasn't actually pulling from all that many tables, and the individual queries could simply be JOINed together. So I refactored these 40K SQL statements into 3, and, well, it sped up. This is the consulting-world equivalent of getting thrown a slow, looping softball pitch, but the client was happy.
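To make the one-by-one versus JOIN difference concrete, here is a minimal sketch in Python with an in-memory SQLite database. The table names (`objects`, `details`), the columns, and the helper functions are all hypothetical stand-ins, not the client's actual schema or the original ColdFusion code.

```python
import sqlite3


def make_db(n_objects=100):
    """Build a toy in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE details (object_id INTEGER, color TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO objects VALUES (?, ?)",
        [(i, f"obj{i}") for i in range(n_objects)],
    )
    conn.executemany(
        "INSERT INTO details VALUES (?, ?)",
        [(i, f"color{i % 5}") for i in range(n_objects)],
    )
    return conn


def load_one_by_one(conn):
    """Slow pattern: one query for the ids, then two more per object."""
    ids = [row[0] for row in conn.execute("SELECT id FROM objects ORDER BY id")]
    queries = 1
    result = []
    for object_id in ids:
        name = conn.execute(
            "SELECT name FROM objects WHERE id = ?", (object_id,)
        ).fetchone()[0]
        color = conn.execute(
            "SELECT color FROM details WHERE object_id = ?", (object_id,)
        ).fetchone()[0]
        queries += 2
        result.append((object_id, name, color))
    return result, queries


def load_with_join(conn):
    """Fast pattern: a single JOIN returns the same rows in one round trip."""
    rows = conn.execute(
        """
        SELECT o.id, o.name, d.color
        FROM objects AS o
        JOIN details AS d ON d.object_id = o.id
        ORDER BY o.id
        """
    ).fetchall()
    return rows, 1
```

Both loaders return the same rows. The only thing that changes is how many statements the database has to parse and execute, which is where the time went in the precaching code.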

In the other two cases, the changes were a bit more extensive. They involved replacing some type of RDBMS with an OLAP server, which precomputes aggregates (sums, averages, etc.) so they're just sitting somewhere in memory for you to request. It's a really useful technology if you're a database implementor, and I like to say it's a lot like history: know OLAP, or doom yourself to reinventing it.

So far, all the warehousing-type projects I've done have been on Microsoft's OLAP server, but I've been keeping an eye out for open source equivalents so that I could work similar blistering-query magic on my OSS database jobs.

The open source OLAP landscape has been pretty limited for the last few years, but things have heated up a lot recently. Currently, I know about the following options:

  • Mondrian - a database-agnostic OLAP implementation whose author was recently acquired/hired by Pentaho (good move, Pentaho!)
  • Olap4All - an ISV offering ROLAP on top of MySQL (not open source, but low cost, I think)
  • Bizgres - coming soon, a Postgres variant tuned for analytics, mentioned here a few days ago. Not sure how much of the good stuff will be open sourced, and pricing is TBD
  • Palo - a cross-platform, multidimensional in-memory store with client APIs for PHP, Java, .NET, etc. Possibly crude on the server tools side, but pretty slick screenshots for their Excel plugin.

    I just stumbled across Palo a few days ago, and it looks really promising. It seems like a good complement for open source database adopters, and the Excel plugin screenshots look extremely professional.
    1:10:01 PM


© Copyright 2006 John Sequeira.