About

Welcome to Panela, Matt Harrison's take on mostly Open Source, Linux, Python, innovation in those areas, other buzzwords and Dick Proenneke. It comes complete with the illustrations as needed. Note the opinions expressed here are merely my opinions and not the opinions of my employer.

about Matt

Calendar

««Jul 2009»»
SMTWTFS
    1234
567891011
12131415161718
19202122232425
262728293031

Mailing List

My RSS Feeds








MonetDB and SQLAlchemy

posted 2007.11.19 Mon

I've spent the past days starting a MonetDB backend for SQLAlchemy. My code is currently attached to ticket 874. We're already using SA to pull data from a proprietary db and are hoping that we can streamline some queries using MonetDB. MonetDB is a column-oriented database (the benefits of Column-oriented dbs are left as an exercise to the reader (and wikipedia)).

Here are my thoughts after spending a bit hacking out the backend (which is really slightly usable if you have nightly of Monetdb):

  • I was thinking writing the backend would require me knowing indepths the guts of how monetdb handled sql. It turned out to be pretty easy. I just used kdiff3 to look at the backends of firebird and sqlite (since they were short). The basic idea of a backend is to wrap dbtypes to SQL statements (when used in columns) and convert types back to native python types when pulled out of queries. Then there is a Dialect object that knows how to reflect ones database. Most of my time was really spent getting that working. The only other thing that (so far) has proved annoying is auto increment columns, I'm using a combination of 4 booleans to determine that (including one called autoincrement....).
  • Though SA has a big wrapper around unittest that gave me a few minutes pause, I finally figured it out. Writing unittests was a great way to get some basic functionality.
  • It'd be nice if SA included a SQL99 or basic compliance testsuite for all the backends. All the tests appear to be silos of tests that may or may not overlap. I pulled/modified tests from sqlite and mysql for monetdb and there wasn't much/any overlap in those tests...
  • On that note I started writing doctests against the documentation for SA. This revealed a bug in monetdb, that hopefully should get fixed soon.
  • So, while the code is still in a nascent state, testing (both unit and doc) have allowed me to progress quite far and quickly.
  • The Monetdb people have been a pleasure to work with. They have been swift to answer questions and fix bugs (3 so far...).
  • One final note. The world of opensource is small. I previously had the pleasure of working with one of the monetdb dev's on a non-python, non-db related opensource project. Cool how you can run into great developers again.

So if you are in the market for a column-oriented db and using python (and feeling brave and adventurous) give the new backend a try.

tags:        

links: digg this    del.icio.us    reddit




1. mike bayer left...
2007.11.19 Mon 6:46 pm

hey Matt -

I had never heard of MonetDB before (or maybe i just forgot the name) and that is interesting that it is a column-oriented database, something i have no experience with. Within SA's test suite, the "select.py" test, the first test suite SA had, is the general "test that the compiler produces the expected SQL" suite. It tests the compiler primarily in its "generic" form, and in some cases branches out to individual dialects; as you noticed, there are also dialect-specific tests in the dialects/ directory. The reason most of the sql compilation tests are only against the generic dialect is because thats where most of the SQL generation is; the test cases are produced based on a "code coverage" model such that our goal is to ensure that every line of code is covered by a unit test. A dialect like Postgres only overrides a few methods of the base compiler so the "pure compilation" tests dont add much when run against PG versus the base dialect.

Its more in the area of actual roundtripping of data to the database that *all* databases are worked very heavily. The primary issues that present themselves across dialects are usually quirks that are specific to the DBAPI interface (which in turn is usually due to the quirks of the databases client API), and idiosyncrasies in typing behavior.

So the real test for MonetDB is if you can get the full suite of unit tests to run against it; for the "fringe" databases, like firebird, maxDB, informix, etc. a lot of those tests had to be marked as 'unsupported'.

Anyway your code looks great and if you are interested in building up generic "SQL compliance" test suites, I can certainly get you right on board with our codebase. Feel free to stop by on IRC as well since all the developers are usually there during the day in the US/europe.


2. Matt left...
2007.11.19 Mon 11:15 pm

Mike-

Thanks for SA and thanks for taking the time to comment. Your sql/* tests are similar to what I had in mind when I said SQL compliance, I hadn't seen them previously. Now I have the general sql tests running (though about 40 odd failures). Hopefully I can get this backend into shape pretty quickly. Like I said, the devs are pretty responsive with my bug fixes, which should take take of about half those test failures.


3. bsp left...
2009.03.05 Thu 12:07 am

very cool. any progress on getting this into sqla other than what's on the ticket?


4. Matt left...
2009.03.06 Fri 8:01 pm

bsp - The SA trac is up to date. Though I have/had SA commit rights, I ended up using postgres instead of monetdb. But support for .4.x was basically done, the only thing missing was row ids (which are needed for orm operations). If you only wanted to use the sql wrapper (which is probably the case if you are using monetdb), the code should be good to go. Sorry I don't have more time....


5. Gijs Molenaar left...
2009.07.03 Fri 6:54 am :: http://monetdb.cwi.nl

I made a rewrite of the Python API for monetdb. I don't have much time left, but I'll check if I can get the sqlalchemy code running. If it works I can see if we can bundle it with the monetdb python api.


6. Gijs Molenaar left...
2009.07.03 Fri 6:55 am

I made a rewrite of the Python API for monetdb. I don't have much time left, but I'll check if I can get the sqlalchemy code running. If it works I can see if we can bundle it with the monetdb python api.