Thursday, September 5, 2013

Old SQL, new SQL or no SQL at all, now that's the question

(image by luana1985)
It's about time to blog something about NoSQL, since it is at the top of its hype cycle according to Gartner (as of July 2013; key-value DBs are actually already sliding fast into the Trough of Disillusionment). And boy, does it show. Last summer I dived into the vast depths of YouTube searching for interesting talks and presentations on the subject, and most of the younger lads presenting their favourite product (which of course is totally awesome) acted as if it were the key to everything. There was even a "Battle of the Backends" at Google I/O 2012. OK, I have to admit, that was quite entertaining.

As a side note, I kind of hope that the confusing name NoSQL will soon be forgotten, as some of the products support some sort of structured query language anyway (if not standard SQL itself), and the alternative name used by some, NewSQL, is not much better, since the "old" (R)DBMSs are starting to gain features from the "new" products. Besides, the basic ideas are not all that new: Lotus Notes, for example, had a document DB way back in the 90s. And when it comes to in-memory DBs, even Oracle has TimesTen, which can act either as a memory cache or as an independent in-memory (R)DBMS. I don't think it is even that useful to bundle together things as different as key-value stores, column-family stores, document stores and graph DBs (and whatever else there is). So it's probably better to talk about them separately, without the buzzwords.

Back on track... Marketing speeches aside, it's not likely that the new approaches are the key to everything (the older lads have probably already seen the "coming" of object DBs and know better). I don't know who conducted the MySQL-SenseiDB comparison tests published on the SenseiDB website, but if your solution somehow looks 100x faster than a widely used solution while running the simplest of test scenarios, it is quite likely that either your test setup is flawed or you're outright messing with the results. Well, maybe there is a reason why their website is still stuck in the year 2012, now in September 2013...

I think I familiarised myself, through YouTube and some written resources, with at least MongoDB, RavenDB, Cassandra and Hadoop, mostly via more than one longer presentation each, plus a couple of clips on the current DBMS scene in general. And hey, I admit, many of those were just terrific talks and I did get enthusiastic about the subject. Once the upsides of the non-RDBMS approaches started to dawn on me, I could quickly think of two cases where an RDBMS was not quite giving adequate performance. Unfortunately, the first of them is tied to proprietary software that will likely support a non-RDBMS in production somewhere in the next decade - if ever - even though e.g. a document-oriented approach might fit it very nicely. With the second case I'm free to try out everything myself, and inspired by some of the DB design presentations I have already made some changes to the design on the current platform (MySQL), which cut 50% off the run time of one particularly slow query. However, that was just plain old denormalisation, nothing new in that, and the initial design might have been bad in practice anyway, even though it was kind of a textbook case of a relational DB model.

What I intend to do is to try out similar designs on both MySQL and a document DB (probably MongoDB), since the data lends itself quite well to a document-oriented design. This will mean having to stuff arrays of things into a single field in MySQL, but as I'm most interested in the DB performance that is OK (and the subsequent parsing in the application shouldn't be much of a problem, either). The thing I'm most curious about is how easy it is to design the DB so that it still allows querying from different angles. At the moment my expectation is that I will probably need to duplicate the data in some parts to allow that, but then again that is also required in reporting DBs (think of cubes).
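To make the idea a bit more concrete, here is a rough sketch of what "arrays in a single field plus some duplication" could look like. It is only an illustration under several assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and the orders/items schema and all table and function names are made up for the example, not taken from my actual data.

```python
import json
import sqlite3


def build_db():
    """Create an in-memory database (sqlite3 standing in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    # "Document-style" table: the whole item list lives in one text field.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, items TEXT)"
    )
    # Duplicated data: a second table that answers "which orders contain X?"
    conn.execute("CREATE TABLE orders_by_item (item TEXT, order_id INTEGER)")
    conn.execute("CREATE INDEX idx_item ON orders_by_item (item)")
    return conn


def save_order(conn, order_id, customer, items):
    """Store the order as one row and duplicate its items for lookups."""
    conn.execute(
        "INSERT INTO orders VALUES (?, ?, ?)",
        (order_id, customer, json.dumps(items)),
    )
    conn.executemany(
        "INSERT INTO orders_by_item VALUES (?, ?)",
        [(item, order_id) for item in set(items)],
    )


def load_order(conn, order_id):
    """Read one order; the array is parsed in the application, not in SQL."""
    row = conn.execute(
        "SELECT customer, items FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        return None
    customer, items = row
    return {"id": order_id, "customer": customer, "items": json.loads(items)}


def orders_containing(conn, item):
    """Query from the other angle, using the duplicated table."""
    rows = conn.execute(
        "SELECT order_id FROM orders_by_item WHERE item = ? ORDER BY order_id",
        (item,),
    ).fetchall()
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = build_db()
    save_order(conn, 1, "alice", ["tea", "milk"])
    save_order(conn, 2, "bob", ["tea"])
    print(load_order(conn, 1))
    print(orders_containing(conn, "tea"))
```

The price of the second table is that every write has to keep both copies in sync, which is exactly the kind of trade-off I want to measure.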

Stay tuned (but don't hold your breath since I can't promise the next post will arrive before your brain cells die of asphyxiation)...