Feelin' the ORM blues

As I’m feeling blue, I’d like to interrupt the series about the JavaScript type system with something that has been preoccupying my mind for quite a few days.

There is something about object/relational mapping (ORM) frameworks that severely bothers me. Maybe you have had similar experiences. For example, when you join an enterprise software project as a new team member and get your introduction to the architecture, there nearly always comes this point when you hear something like: “We have OR mapper ‘XYZ’. We’re using it for all our data access throughout the whole project.” (with ‘XYZ’ being ‘NHibernate’ for many .NET projects older than 2-3 years).

These are situations when – even as a not so much religious person – the beginning of the Ten Commandments comes to my mind:

I am the Lord thy God […]. Thou shalt have no other gods before me.

Exodus 20:2-3

It’s perfectly understandable that we as software engineers want to avoid any unnecessary proliferation of diverse methodologies to solve exactly the same problem within the scope of one project. Clearly, there should be only one way of doing it. But then again, what is the problem that ORM frameworks are trying to solve? Obviously it’s not the actual data access; that would be too easy. The problem they are trying to solve is called the object-relational impedance mismatch. This problem arises if (and only if) you expect your relational database to be what it is not: object-oriented.

But what if you do not expect that? What if you want to use the database as an external resource to which you can send commands? Maybe you don’t want to read anything and just perform a bulk update or deletion of certain rows? What if you want to check for the existence of data matching specific criteria but only want to receive 1 or 0 as an answer? Isn’t that a very different story? Isn’t that in no way related to the object-relational impedance mismatch problem? Yet bypassing the “official data access layer” (consisting of the one and only ORM) is often perceived as breaking something sacrosanct. Tooling, introduced purely by the necessity to solve a very distinct problem, can quickly become a general dogma.
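To make that concrete, here is a minimal sketch of the two scenarios just mentioned – an existence check that returns only 1 or 0, and a bulk update that never materializes a single object. It uses Python and sqlite3 purely for illustration; the table and column names (`orders`, `status`, `total`) are made up for the demo.

```python
import sqlite3

# Hypothetical demo schema -- an in-memory database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (status, total) VALUES (?, ?)",
    [("open", 10.0), ("open", 250.0), ("shipped", 99.0)],
)

# Existence check: the database answers with 1 or 0 -- no entities are loaded.
exists = conn.execute(
    "SELECT EXISTS(SELECT 1 FROM orders WHERE status = 'open' AND total > 100)"
).fetchone()[0]

# Bulk update: a single statement; no rows travel to application memory at all.
updated = conn.execute(
    "UPDATE orders SET status = 'flagged' WHERE status = 'open' AND total > 100"
).rowcount
```

Neither of these operations has anything to do with mapping objects to relations – which is exactly why routing them through an ORM buys nothing.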

At the root of the dilemma I see – as always – good intentions: the rise in popularity of object-relational mapping frameworks seems to be directly coupled to the rise in popularity of domain-driven design (DDD) in the early 2000s. In particular, Martin Fowler’s Domain Model pattern is (or at least was at the time of publication) considered best practice for implementing complex business rules in a complex domain. Personally, I cannot say whether I like this pattern, because I’ve never seen it in real life (and I wouldn’t win an award for passionate devotion to the “pure” object-orientation doctrine, anyway). All I have seen until today is its – probably much easier to implement – little brother, the Anemic Domain Model, an anti-pattern according to Fowler. However, apart from any judgment, I can clearly see the advantages of using an O/R mapper with the first approach, whereas for the second (anti-)pattern it doesn’t seem to matter that much.

I don’t want to claim that ORM frameworks are superfluous. There are many scenarios where I’m glad to have them. For example, when I’m dealing with a complex object graph that can be manipulated in a lot of ways (read: code paths) according to complex business rules and then has to be written back to the data store: thank you, Entity Framework or NHibernate, for being there and tracking everything! But then again, given that the project in question is big enough, when you devote your whole data access story to a single O/R mapping framework, the following things will most likely happen (at least in my experience):
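The change tracking I’m grateful for here can be boiled down to the unit-of-work idea: snapshot an entity when it is loaded, diff it on save, and emit only the updates that are actually needed. The following is a toy Python sketch of that concept – it is in no way how Entity Framework or NHibernate are actually implemented, just an illustration of the bookkeeping they take off your hands.

```python
class UnitOfWork:
    """Toy change tracker: remembers each entity's state at load time."""

    def __init__(self):
        self._tracked = []  # list of (entity, snapshot-of-attributes) pairs

    def track(self, entity):
        # Snapshot the entity's attributes as they were when loaded.
        self._tracked.append((entity, dict(vars(entity))))
        return entity

    def pending_changes(self):
        # Diff the current state against the snapshot; report only dirty fields,
        # which a real mapper would translate into targeted UPDATE statements.
        changes = []
        for entity, before in self._tracked:
            dirty = {k: v for k, v in vars(entity).items() if before.get(k) != v}
            if dirty:
                changes.append((entity, dirty))
        return changes


class Order:
    def __init__(self, order_id, status):
        self.order_id = order_id
        self.status = status


uow = UnitOfWork()
order = uow.track(Order(1, "open"))
order.status = "shipped"  # mutate the graph somewhere deep in the business logic
changes = uow.pending_changes()  # only the modified field is reported
```

For a graph mutated along many code paths, having this diffing done for you is genuinely valuable – which is precisely why the tool is so tempting to apply everywhere else, too.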

  • You will bombard your database with highly unoptimized queries
  • You will constantly shovel significantly more data over the wire than you actually need
  • You will easily experience deadlocks that are hard to trace
  • You will get issues with caching and stale objects that are hard to trace
  • You will have a hard time tracking down the origin of a cryptic query you find in your log files
  • You will have to write a lot of hard-to-understand code just to construct a medium-complex SQL query
  • You will most likely have no chance at all to write code that translates into a more specialized SQL query
  • You will not be able to leverage vendor specific features
  • You will have to load thousands of rows into memory just to set one column to a specific value
  • You will begin to design other parts of your application(s) according to the O/R mapper’s abilities
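The second-to-last point deserves a concrete illustration. Below is a sketch – again in Python with sqlite3 and made-up table names – contrasting the ORM-ish pattern of materializing every entity just to flip one flag with the single set-based statement the database could have executed on its own.

```python
import sqlite3

# Hypothetical demo schema with 5,000 rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL, discounted INTEGER)"
)
conn.executemany(
    "INSERT INTO products (price, discounted) VALUES (?, 0)",
    [(float(p),) for p in range(1, 5001)],
)

# ORM-ish pattern: pull every matching row into memory, then write back one by one.
rows = conn.execute("SELECT id FROM products WHERE price > 100").fetchall()
for (row_id,) in rows:
    conn.execute("UPDATE products SET discounted = 1 WHERE id = ?", (row_id,))
# 4,900 rows crossed the wire, and 4,900 UPDATE statements went back.

# Set-based alternative: one statement, zero rows transferred to the application.
conn.execute("UPDATE products SET discounted = 1 WHERE price > 100")
```

Against an in-memory SQLite file the difference is academic; against a production database over a network, it is the difference between milliseconds and minutes.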

Of all these bad things, it is the last one that makes me most furious. When, in an architectural meeting, someone has to say the words “We can’t do this. Our data access doesn’t work that way.”, it makes me want to bang my head against the wall.

It seems we developers are very picky about avoiding buy-in to any specific technology, but when it comes to data access (and, by the way, also logging and dependency injection), we just let the frameworks (or at least their methodology), once “carefully” chosen, leak through many other parts of our applications, mostly even un-abstracted. Every component has to adhere to their rules. This is the one way we do it, and there should be only one way of doing it, right?

As much as I would prefer not to compare the business of software development with the horrors of a war, the second sentence of Ted Neward’s statement about object-relational mapping is so true in my opinion that I cannot resist citing it here:

“Although it may seem trite to say it, Object/Relational Mapping is the Vietnam of Computer Science. It represents a quagmire which starts well, gets more complicated as time passes, and before long entraps its users in a commitment that has no clear demarcation point, no clear win conditions, and no clear exit strategy.”

No clear exit strategy? That doesn’t sound too good! Now you know why I’m feeling blue …