Wednesday, July 31, 2013

Polyglotism

Lately, I've been educating myself on what is commonly called NoSQL, and this nice presentation by Martin Fowler made me think about things I've written about earlier. The same idea also comes up in a talk on the data storage technologies employed at Craigslist.



The key phrase was polyglot persistence (about 51 minutes into the video). I had probably heard it before, but now its full meaning finally dawned on me. It means that to build a system - a highly performant one, at least - you probably need to take advantage of the different DBMSs that are around, in the same way that multilingual developers push the idea of using the language that is right for the job. So the diagram Fowler gave on polyglot persistence could be extended with the programming languages the different parts of the system have been built with, and probably also with the OSs the parts run on, and so on. There is a reason dozens of DB brands, programming languages and whatnot exist: each is an attempt to produce a proper tool for a specific problem. I can see this could also lead to a sprawl of diverging computational environments, but I guess the constant popularity contest between products makes either the products converge or the niche options quietly fade away as the more popular ones gain market share.
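To make the idea a bit more concrete, here is a minimal sketch of polyglot persistence in Python. It is not taken from Fowler's talk: the `Shop`, `KeyValueStore` and `OrderStore` names are made up for illustration, an in-memory dict stands in for a real key-value store, and SQLite stands in for a full RDBMS. The point is only that short-lived session data and transactional order data go to different kinds of stores behind one application.

```python
import sqlite3


class KeyValueStore:
    """In-memory stand-in for a key-value store (hypothetical, e.g. for sessions)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class OrderStore:
    """Relational store for data that benefits from transactions and ad hoc queries."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE orders ("
            "id INTEGER PRIMARY KEY, customer TEXT NOT NULL, total REAL NOT NULL)"
        )

    def add(self, customer, total):
        # The connection as a context manager commits on success, rolls back on error.
        with self._db:
            cur = self._db.execute(
                "INSERT INTO orders (customer, total) VALUES (?, ?)",
                (customer, total),
            )
        return cur.lastrowid

    def total_for(self, customer):
        row = self._db.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
            (customer,),
        ).fetchone()
        return row[0]


class Shop:
    """Application code that picks a store per kind of data."""

    def __init__(self):
        self.sessions = KeyValueStore()  # fast, disposable shopping carts
        self.orders = OrderStore()       # durable, queryable orders

    def add_to_cart(self, session_id, item):
        cart = self.sessions.get(session_id, [])
        self.sessions.put(session_id, cart + [item])

    def checkout(self, session_id, customer, prices):
        cart = self.sessions.get(session_id, [])
        total = sum(prices[item] for item in cart)
        order_id = self.orders.add(customer, total)
        self.sessions.put(session_id, [])  # empty the cart after ordering
        return order_id
```

In a real system the two classes would wrap separate database products, which is exactly where the extra operational cost of polyglot persistence comes from.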

All this polyglotism further widens the options for a fresh wannabe ICT specialist. There certainly is a need for narrow, deep expertise (to know something like the Spring framework from top to bottom, one really needs to spend considerable time with that alone), but I do not see it becoming very common for companies that are not huge and are less than 100% technology oriented to hire a bunch of specialists in different areas. Or they could, but then they could just as well hire them as consultants, since they are not likely to need each of them for an equal amount of time constantly, and if the team members do not know much about each other's areas of expertise, they really can't be assigned to work on the same task together. My experience from the service sector is that there is a huge need for a team small enough to be kept constantly busy with the ongoing tasks, one that you can throw at any kind of problem and that can handle it based on its collective wide experience.

What I am getting at is that even though I've mostly been using Oracle (from version 8 onward, I think) as the data back-end in my professional life, the potential for interesting non-RDBMS encounters is rising fast. All along the road I've been running into cases where it is not feasible to stick to the good ol' 3NF and the data has to be denormalised to get performance - something the NoSQL folks are very fond of. So far the convention has been that if an application needs a DB (bigger than can reasonably be embedded within the product on the same server), it supports one or more of the big traditional RDBMSs. Now I already see RHQ including Apache Cassandra in the package, and even though Cassandra comes bundled with the product, one basically needs to install it on a server of its own for practical use - so in effect RHQ requires Cassandra in addition to one of the supported RDBMSs. Personally, this made me think about how I can sell the idea of requiring another server for application server monitoring to my superiors...
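As a footnote to the 3NF remark above, the trade-off can be sketched in a few lines of Python with SQLite. The table and column names are invented for illustration and the data set is far too small to show any performance difference; the sketch only shows that the denormalised table answers the same question without a join, and what it costs when the copied value changes.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Normalised: the customer name lives in exactly one place.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    -- Denormalised: the name is copied into every order row.
    CREATE TABLE orders_denorm (
        id INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        total REAL NOT NULL
    );
""")

db.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
db.executemany(
    "INSERT INTO orders (id, customer_id, total) VALUES (?, ?, ?)",
    [(1, 1, 10.0), (2, 1, 2.5)],
)
# Fill the denormalised table from the normalised ones.
db.execute("""
    INSERT INTO orders_denorm (id, customer_name, total)
    SELECT o.id, c.name, o.total
    FROM orders o JOIN customers c ON c.id = o.customer_id
""")

# Reading: the normalised form needs a join, the denormalised one does not.
normalised = db.execute("""
    SELECT c.name, o.total
    FROM orders o JOIN customers c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
denormalised = db.execute(
    "SELECT customer_name, total FROM orders_denorm ORDER BY id"
).fetchall()

# Writing: a rename touches one row when normalised,
# but every copied row when denormalised.
db.execute("UPDATE customers SET name = 'Alice B.' WHERE id = 1")
renamed_rows = db.execute(
    "UPDATE orders_denorm SET customer_name = 'Alice B.' "
    "WHERE customer_name = 'Alice'"
).rowcount
```

Both queries return the same rows; the denormalised variant buys cheaper reads at the price of more expensive, easier-to-forget updates.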