Dotnet Thread: Impedance Mismatch Reframing

This is a reply to Stephen Forte's post "Impedance Mismatch" from a while back. I would have posted about it sooner, but sadly I only saw it today when a co-worker, Stefan Moser, sent me the link. I know that this debate has become quite heated in the community, so I will refrain from personal attacks (such as those unfortunately experienced by Julia Lerman) and focus solely on the technical merits of the post.

My first problem with ORMs in general is that they force you into a "objects first" box. Design your application and then click a button and magically all the data modeling and data access code will work itself out. This is wrong because it makes you very application centric and a lot of times a database model is going to support far more than your application.

Well, I wouldn't say that this is a problem with ORMs per se, but rather a problem with some tools. Those who are using Domain Driven Design are certainly not using this methodology. One of the main reasons I tell people to use DDD is that they can design their data storage mechanisms in parallel with their domain model, seeking an optimal solution to each. In other words, we should be embracing the impedance mismatch and doing what is best on both sides. The paragraph then continues with:

In addition an SOA environment will also conflict with ORM.

I do not agree with this in any way, shape, or form, but I am happy to leave it open to "the many definitions of SOA". I think it can quite easily be done if you follow solid command query separation. Udi Dahan gives a nice discussion of this on his blog.

Later in the article (I am jumping around a bit to keep my own post coherent):

One of the biggest hassles I see with LINQ to SQL is the typical many-to-many problem. If I have a table of Ocean Liners, vessels, and ports, I’ll typically have a relational linking table to connect the vessels and ports via a sailing. (Can you tell I am working with Ocean Freight at the moment?) The last thing I want at the object layer is three tables! (And then another table to look up the Ocean Liner that operates the vessel.) Unfortunately, this is what most tools give me. Actually I don't even want one table, I want to hook object functionality to underlying stored procedures. I really want a port object with a vessel collection that also contains the ocean liner information.

The author discusses his experiences with Linq2Sql and then generalizes them to "what most tools give me"; this is either an unfortunate fallacy or a lack of research on available tooling. Linq2Sql is not a real "mapper", nor is what the author is referring to "mapping"; it is simply an Active Record implementation that is not using self-serving objects. This is what happens when mappers stay too close to the relational structure: they suck in terms of domain language and structure.

If, however, we were to use a real mapper (let's say the one those notorious mafia guys are using), a quite different scenario would exist: a domain that sounds almost exactly like what the author says he wants. This paragraph is also key in showing that the author has not researched Domain Driven Design. I would bet that Stephen and Eric could have some really interesting discussions at the Advisory Council, as Eric uses this exact problem domain as a naive starting point for examples in about half of his book.

A more serious problem, though, is shown in the author's relational bias when domain objects are called "tables". Why would anyone have a domain full of "tables"? These are behavioral objects. Unless this misunderstanding of what a domain model is gets corrected, the rest of what a domain model is or does will never make any sense.
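To make "behavioral objects" concrete, here is a minimal sketch of the port, vessel, and ocean liner example as a domain model rather than as three tables. It is not code from either post: the class names, the `schedule_arrival` method, and the "one call per vessel per day" rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class OceanLiner:
    """The company that operates a vessel (hypothetical name and field)."""
    name: str


@dataclass(frozen=True)
class Vessel:
    """A vessel carries its ocean liner with it; no lookup table is visible."""
    name: str
    operated_by: OceanLiner


@dataclass
class Port:
    """A behavioral object: it owns its sailings and enforces its own rules."""
    code: str
    _sailings: list = field(default_factory=list, init=False, repr=False)

    def schedule_arrival(self, vessel: Vessel, on: date) -> None:
        # Illustrative rule (an assumption): one call per vessel per day.
        if (vessel, on) in self._sailings:
            raise ValueError(f"{vessel.name} already calls at {self.code} on {on}")
        self._sailings.append((vessel, on))

    def vessels(self) -> list:
        """Distinct vessels calling at this port, in order of first arrival."""
        seen = []
        for vessel, _ in self._sailings:
            if vessel not in seen:
                seen.append(vessel)
        return seen
```

The linking table between vessels and ports has not disappeared; it has become the private `_sailings` collection, and how it is stored is a separate decision from how the domain speaks about it.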

A further lack of understanding of Domain Driven Design is shown by the statement:

ORM is real good for CRUD and real bad at other things.

Again, I believe the author has for some reason confused ORM with Active Record. I would never under any circumstances recommend Domain Driven Design for a CRUD app, as there are easier ways (like using Active Record).

Although it may be surprising, it is my belief that the author is actually a Domain Driven Design aficionado but has just not realized it yet.

I prefer to build the application's object model and the data model at about the same time, with a "whiteboarding" approach that outlines the flows of data and functionality across the business process and problem set.

It is quite common in an "object first" perspective to do database and code modeling either in small iterations or in parallel, where a team of object experts focuses on the domain model and the best way to model the data to support transactional behaviors, while a team of database experts focuses on how best to store the data given their own set of requirements. These types of sessions would in fact be prescribed in an agile team, and the small "whiteboarding" sessions are absolutely prescribed by Domain Driven Design.

Maybe it is the MBA talking but I tend to be "business and customers first" when I design a system. (Those of you that know me know that I have designed some very large and scalable systems in my day.)

This is one of the core beliefs of Domain Driven Design. The primary example would be the creation of a Ubiquitous Language to ease communication between the "business and customers" and the team.

What I am saying (and have been saying for a long time) is that we should accept, no, embrace the impedance mismatch! While others are saying we should eradicate it, I say embrace it.

Again we are back in agreement with Domain Driven Design. I like to look at Domain Driven Design as an orthogonal architecture: my domain survives anything that is moved around it, as it is the core of my business and where the largest part of my investment has gone.

We come now to where the author is unfortunately not in line with DDD but perhaps can be moved. The only way to reach an orthogonal architecture is to ensure the purity of the domain model. The OLTP RDBMS will eventually decline in popularity. What happens when I want to move to, say, "the cloud" and just store my aggregate roots as XML? This is a perfectly valid and extremely effective architecture. If I favor the RDBMS side of the impedance mismatch too heavily, then this change will not be orthogonal to my domain but will be extremely costly. The author would disagree with my reasoning, as he points out:
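As a rough sketch of what "store my aggregate roots as XML" could look like, the repository below serializes a whole aggregate into one XML document keyed by its id. Everything here is an assumption for illustration: the `Port` aggregate, the `XmlPortRepository` name, the element names, and the dict that stands in for a cloud blob or document store.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Port:
    """Hypothetical aggregate root; note that it imports nothing about storage."""
    code: str
    vessel_names: list = field(default_factory=list)

    def register_vessel(self, name: str) -> None:
        if name in self.vessel_names:
            raise ValueError(f"{name} is already registered at {self.code}")
        self.vessel_names.append(name)


class XmlPortRepository:
    """Keeps each aggregate as a single XML document, keyed by port code."""

    def __init__(self) -> None:
        # A dict stands in for a blob or document store (an assumption).
        self._documents = {}

    def save(self, port: Port) -> None:
        root = ET.Element("port", code=port.code)
        for name in port.vessel_names:
            ET.SubElement(root, "vessel").text = name
        self._documents[port.code] = ET.tostring(root, encoding="unicode")

    def fetch_by_id(self, code: str) -> Port:
        root = ET.fromstring(self._documents[code])
        return Port(
            code=root.get("code"),
            vessel_names=[v.text for v in root.findall("vessel")],
        )
```

Swapping this repository for one backed by an RDBMS would leave `Port` unchanged, which is the sense in which the storage change stays orthogonal to the domain.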

ORM tools should evolve to get closer to the database, not further away.

and

Developers who write object oriented and procedural code like C# and Java have trouble learning the set-based mathematics theory that govern the SQL language. Developers are just plain old lazy and don't want to code SQL since it is too "hard." That is why you see bad T-SQL: developers try to solve it their way, not in a set-based way.

and

So ORMs are trying to solve the issue of data access in a way that C# and VB developers can understand: objects, procedural, etc. That is why they are doomed to fail. The further you abstract the developer from thinking in a set-based way and have them write in a procedural way and have the computer (ORM) convert it to a set-based way, the worse we will be off over time.

Well, I think I have already discussed the first of these points pretty well: by moving closer to the database we break our hopes of an orthogonal architecture. The second comment, albeit sounding like it came from a grand and mighty SQL wizard sent down by the gods to lift us heathens from our sinful ways, is actually a red herring, as is the third when framed properly.

I do know relational algebra (yes, I can tell you what an anti-join is), and I challenge anyone to show me notation for an insert. While one could argue it can be involved in, say, a delete by PK/FK or an update by PK, it is for all intents and purposes useless in the process of writing to a properly normalized database; these operations tend to be procedural regardless. I will admit there are times where it can come in handy, but they are by far the minority. The relational algebra is focused on reading data and manipulating sets.
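For readers who have not met the term, an anti-join returns the rows of one relation that have no match in another. The sketch below expresses it over plain lists of dicts; the `anti_join` helper and the vessel and sailing rows are made up for illustration. Note that it is purely a read over two sets, which is the point being made about where relational algebra applies.

```python
def anti_join(left, right, key):
    """Return the rows of `left` whose `key` value appears in no row of `right`."""
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] not in right_keys]


# Hypothetical data: which vessels have no sailing scheduled?
vessels = [
    {"vessel_id": 1, "name": "Aurora"},
    {"vessel_id": 2, "name": "Borealis"},
    {"vessel_id": 3, "name": "Cygnus"},
]
sailings = [
    {"vessel_id": 1, "port": "RTM"},
    {"vessel_id": 3, "port": "SIN"},
]
idle_vessels = anti_join(vessels, sailings, "vessel_id")
```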

As many who have had long post-conference talks over beer with me know, I consider any query complex enough to require thinking about the relational algebra to be a report. Reports are not expressed within my domain and may or may not be read from the same data source (I oftentimes use an eventually consistent reporting model specifically for the purpose of running such queries). I often take this to extremes: in an ideal world my repositories have a single read method, FetchAggregateByUniqueId. Anything that searches in a more complex way is deemed a report and sits outside of this (usually as a small mapper that returns DTOs that are bound for the screen). My "reports" all make very strong use of SQL and relational algebra; my domain has no need to know that any of it exists, as it is essentially a write-only model. I could go much more into this, but that is another post.
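The split described above can be sketched as follows. Only the method name FetchAggregateByUniqueId comes from the post (rendered here in Python style); the `Vessel` aggregate, the `vessel` table, the `VesselsPerPortRow` DTO, and the use of an in-memory SQLite database are assumptions for illustration. For brevity the report reads from the same database as the repository, although the post notes that it could just as well read from a separate, eventually consistent store.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Vessel:
    """Hypothetical aggregate root on the write side."""
    vessel_id: str
    name: str
    port_code: str

    def reroute(self, port_code: str) -> None:
        self.port_code = port_code


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE vessel (id TEXT PRIMARY KEY, name TEXT, port_code TEXT)"
    )


class VesselRepository:
    """The domain's only read is a fetch by unique id."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def fetch_aggregate_by_unique_id(self, vessel_id: str) -> Vessel:
        row = self._conn.execute(
            "SELECT id, name, port_code FROM vessel WHERE id = ?", (vessel_id,)
        ).fetchone()
        if row is None:
            raise KeyError(vessel_id)
        return Vessel(*row)

    def save(self, vessel: Vessel) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO vessel (id, name, port_code) VALUES (?, ?, ?)",
            (vessel.vessel_id, vessel.name, vessel.port_code),
        )


@dataclass(frozen=True)
class VesselsPerPortRow:
    """A DTO bound for the screen; it carries no behavior."""
    port_code: str
    vessel_count: int


def vessels_per_port_report(conn: sqlite3.Connection) -> list:
    """A "report": set-based SQL that lives outside the domain model."""
    rows = conn.execute(
        "SELECT port_code, COUNT(*) FROM vessel "
        "GROUP BY port_code ORDER BY port_code"
    )
    return [VesselsPerPortRow(*row) for row in rows]
```

The domain code only ever calls `fetch_aggregate_by_unique_id` and `save`; the grouping query never passes through a `Vessel` object, so the report side is free to use as much SQL as it likes.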

Getting back to the article, the author does however end with a great quote from Ted Neward:

"Developers [should] simply accept that there is no way to efficiently and easily close the loop on the O/R mismatch, and use an O/R-M to solve 80% (or 50% or 95%, or whatever percentage seems appropriate) of the problem and make use of SQL and relational-based access (such as "raw" JDBC or ADO.NET) to carry them past those areas where an O/R-M would create problems."

This is great advice. Just remember, if you do it, to hide it from your domain and to use it sparingly: you may not always have an RDBMS sitting behind you, and if you don't, these set-based operations may be quite difficult to implement.

Sunday, July 27, 2008
