Wednesday, January 21, 2009

Why Should You Use an ORM?

A user recently asked for good arguments in favor of using Object/Relational Mapping technology:

If you were to motivate [sic] the "pro's" of why you would use an ORM to management/client, what would the reasons be?

Try and keep one reason per answer so that we can see what gets voted up as the best reasons.

I offered four answers. The first three got the most votes, but my last answer got little interest.

  1. Speeding development. For example, eliminating repetitive code like mapping query result fields to object members and vice-versa.
  2. Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.
  3. Supporting OO encapsulation of business rules in your data access layer. You can write (and debug) business rules in your application language of preference, instead of clunky trigger and stored procedure languages.
  4. Generating boilerplate code for basic CRUD operations (Create, Read, Update, Delete). Some ORM frameworks can inspect database metadata directly, read metadata mapping files, or use declarative class properties.
There are lots of other reasons for and against using ORM frameworks. Generally, I'm not a fan of ORM's, because their benefits don't seem to make up for their complexity and tendency to perform slowly. Their chief value is in reducing the time taken in repetitive development tasks.

Hibernate, for example, is about 800,000 lines of code (Java and XML), but it's complex enough that I doubt it's easier to learn or to use than SQL. Besides, there seem to be fundamental tasks, such as a simple JOIN that are impossible to do through the entity interface. Please correct me if I'm wrong, but I've been searching tutorials and examples and I haven't found a way to fetch a joined result set from two entities, without writing a custom query in HQL (Hibernate's abstract version of SQL).

I was also led to a blog by Glenn Block, titled "Ten advantages of an ORM (Object Relational Mapper)." I disagree with Block on several points. He cites some traits of ORMs as advantages where I see them as defects. He also cites features that are not specific to ORMs; they could be achieved with any type of data access library.

update: Upon request, here are some specific comments on Glenn Block's list of advantages of an ORM:

1. Facilitates implementing the Domain Model pattern

Not necessarily. I can design Domain Model classes containing plain SQL as easily as I can design classes that operate on the database via an ORM layer. Keep in mind that ActiveRecord is not a Domain Model.

2. Huge reduction in code.

Depends. When executing simple CRUD operations against a single table, yes. When executing complex queries, most ORM implementations fail spectacularly compared to the simplicity of using SQL queries.

3. Changes to the object model are made in one place.
This is not a benefit of an ORM. Many people use ORM interfaces inexpertly, so when the database structure changes, they still have to update many places in their application to reflect the change. But instead of redesigning SQL queries, they have to redesign usage of the ORM. There is no net win. They could structure their application using plain SQL queries and still be as likely to achieve the benefit of DRY.
4. Rich query capability.
Absolutely wrong.
5. You can navigate object relationships transparently.
This is definitely a negative rather than a positive. When you want a result set to include rows from dependent tables, do a JOIN. Doing the "lazy-load" approach, executing additional SQL queries internally when you reference columns of related tables, is usually less efficient. Leaving it up to the ORM internals deprives you of the opportunity to decide which solution is better.
6. Data loads are completely configurable ...
This is not a benefit of an ORM. It is actually easier to achieve this using plain SQL.

7. Concurrency support.

Again, not a benefit of an ORM.

8. Cache managment.
This has nothing to do with using an ORM. I can cache data using SQL.
9. Transaction management and Isolation.
Also has nothing to do with using an ORM versus a more direct DAL.
10. Key Management.
Ditto.

I'm posting to my blog the questions I've answered on StackOverflow, which earned the "Nice Answer" or "Good Answer" badges. This was my answer to "Why Should You Use An ORM?"

36 comments:

kezugo said...

and another question is, which orm is the best one in php world ?

Bill Karwin said...

@kezugo: You're right, that is another question.

The choices are basically:

- Propel
- Doctrine
- Zend Framework's Db-Table
- Maintainable Framework's Model

Of course there are other PHP ORM's in existence, but I believe those four are the highest quality.

Each has their own strengths and weaknesses, and the decision of which is best is subjective. You should read each of their documentation and decide which fits your needs and your style best.

Bill Karwin said...

I forgot one, Qcodo. Those guys put in a lot of work on theirs too.

Cerebrado said...

Sorry... Did you say "encapsulation of business rules in your data access layer"?
I thought that data had the responsability to persist data. And you could use another data layer (to hit another database, for example) and it should work.
I don't think it's a good thing to mess business rules with data persistance.

Bill Karwin said...

@Cerebrado: You're absolutely right. That phrasing was not well thought out.

I do recommend keeping business rules separate from the DAL. Please read http://karwin.blogspot.com/2008/05/activerecord-does-not-suck.html

Boris said...

Very lucid and well-written post.

I was disappointed you didn't elaborate on your disagreements with Glenn Block's post, however.

You said:

"He cites some traits of ORMs as advantages where I see them as defects. He also cites features that are not specific to ORMs; they could be achieved with any type of data access library."

The mystery is killing me.

Bill Karwin said...

@Boris: Thanks for your comment! See my update in the main part of this blog article, above.

Phil John said...

As for querying a joined result set without using HQL (or in any ORM) you could always create a view on the database and have a domain object created from that - it's how I do it in the .NET world.

Russ said...

ORM is alright, but what if you could store info without doing any data-mapping? Enter OODB (object-oriented database)

With an OODB, you just put your objects into storage AS-IS.

I'm in the process of making CouchDB work as an OODB for CFCs. So far I've got it working for basic operations:
objectID = OODB.save(myObject); //save object
OODB.load(myObject, objectID); //repopulate an object

If you'd like to try it out, go install CouchDB and then check out http://couchdb.riaforge.com
(You'll find the OODB stuff in the SVN only, not in the project ZIP file)

Boris said...

@Bill - Thanks for the update!

The thing that sticks out at me right away is #5. As I see it, the whole point of the ORM is to translate between a set-oriented model and an object-oriented one. So, to the extent that the ORM makes it easier to navigate entity relationships from the object world, I would see that as one of the foremost advantages in fact.

Boris said...

FYI there's been some discussion of this post on Reddit:

http://www.reddit.com/r/programming/comments/9dsnu/why_should_you_use_an_orm/

Bill Karwin said...

@Russ and @Boris: Thanks for your comments. To both of you, I would answer that OODBMS and ORM works only on objects that we've instantiated in the application layer. I.e. there's no way to do a query like this:

UPDATE Bugs SET status = 'CLOSED' WHERE status = 'OPEN';

To do this in an ORM or an OODBMS, you'd have to fetch all bugs that match the criteria and instantiate objects for them. Then you could set the attribute and save the objects back to the database one by one. This is expensive and certainly requires more code than the equivalent SQL operation shown above.

This illustrates an advantage of a language like SQL that treats sets as a first-class data type. The OO paradigm cannot substitute for the relational paradigm in all cases. There are some ordinary operations that SQL can do much better.

Another is referential integrity. This is a fundamental quality of the relational model, and there's no equivalent in an OODBMS or an ORM (the ORM allows navigation through references, but cannot enforce referential integrity).

peanutz said...

I added a question on StackOverflow about this post. Didn't want to bother you about it here as it is probably too rudimentary.

http://stackoverflow.com/questions/1337802/can-you-represent-an-application-object-in-a-way-that-a-relational-database-can-u

thebman said...

Great article. I agree wholeheartedly. I have yet to participate in an ORM project which did not at least double the cost of development AND double the server cost. Successful ORM based applications which do not tax the database are very difficult to create, and training people to understand your ORM app can be very difficult. It usually takes about 6 months on the job before developers really have a handle on ORM, now that's expensive. And if you plan on massive data growth... watch out! Plan for a redeployment on a clustered solution in short time and better buy some expensive hardware to support that solution. ORM makes performance tuning near impossible, leaving you with throwing money and servers at the problem as your only alternative.

Ennio Wolsink said...

So reading this, I'd have to conclude: ORM, ok to use, but whenever you notice it's actually more expensive than regular SQL code, don't use it?

I.e. let your Models decide wether or not to use an ORM for a specific task?

Bill Karwin said...

@Ennio: As far as I can tell, the only real value of an ORM is to reduce the tedious getter-and-setter code and mapping object attributes to database columns. All other supposed advantages to an ORM can either be achieved without an ORM, or else they are actually weaknesses.

So I imagine if one could write a code-generator that helps remove the repetitive work from writing Domain Model classes, without any ORM-iness, that would be just as good, and avoid the pitfalls.

Also you might be interested by this blog by PostgreSQL expert David Fetter. In short, he writes: "Stop trying to generate SQL."

http://people.planetpostgresql.org/dfetter/index.php?/archives/40-Removing-Much-of-the-Suck-from-ORMs.html

Jason said...

@Bill

UPDATE Bugs SET status = 'CLOSED' WHERE status = 'OPEN';


I think everyone is missing the point of why you want to wrap any data into an object....

In that above example you want to close every open bug... Lets assume each bug is assigned to a user, every bug has a set of history assigned to it... Just executing that query doesnt give you the flexibility of communicating to the user whos bug it is, the user who the bug is assigned to, adding a history to the bug... But if you select al lthe bugs that have a status of open, then iterate them, run $bug->close(); then the close method has the flexibility of communicating to the users tied to that bug and flagging a history... or doing anything else that you would want to do.

Its all a matter of preference ... Im a _HUGE_ fan of using objects instead of assoc arrays... why wouldnt you? It adds flexibility and a place in code to encapsulate logic for that data set.

Bill Karwin said...

Hi @Jason, thanks for your comment. Certainly, there are advantages to wrapping an database entity with an object in your application. See reason #3 at the top of my post here.

The UPDATE Bugs statement is meant to be a concrete example of an operation that is ordinary in SQL--modifying data without fetching it first--but isn't supported by the typical ORM style.

True, maybe you want some actions to be associated with the change per row, as you describe, and it's easier to code in your preferred language than in the RDBMS trigger language.

But what about cases when you don't need to code associated actions? Have you ever tried to update 10 million rows by fetching them one at a time and saving each row back to the database individually?

Jason said...

@bill:

I think there is a place for raw SQL... those places include tool sets that arent built into a webapp. To write a migration tool or a utility using an ORM isn't logical.
And having a user, using a web interface, update 10 million rows is also probably not in anyones list of best practices.

Pooria said...

My reasons for using ORM:

You dont write SQL for CRUD operations, less code to write, easier to maintain.

Domain objects are reusable, they have no application-specific SQL code stuffed in them that couples them tightly to the current application's (or it's current form's) database schema.

Database schema changes are shielded from our OO representation; no need to see if our objects or our queries still work, after a schema change, as long as we reconfigure the mappings of the changed parts.

jchen100228 said...

"And having a user, using a web interface, update 10 million rows is also probably not in anyones list of best practices."

Then I guess Google should go out of business. They let you search (not update) billions of pages in a few seconds.

Try that with your favorite ORM tool.


I'm of the opinion that:

1. It costs more
2. It confuses people
3. That's considered a good thing by people who want to make money
4. When performance REALLY matters, the orm is quickly discarded

I think for ages developers have answered the question "where shall I put my code obfuscation?" with "in the data access layer." You can't obfuscate the front end easily without messing up the event flow of the application. The database is readily understood by anyone who understands sql, and its level of normalization is easily determined and unnecessary complexity is easy to identify. The DAL sits in the middle--and since it often goes unread, it's a great place to obfuscate.

These efforts at obfuscation eventually grew into a ORM tools that are so WELL obfuscated it appears they actually serve a useful purpose.

The poor processer disagrees. It knows it's doing unnecessary work for no reason at all.

Bill Karwin said...

@Maria: Thanks for your comment. However, I have deleted your subsequent comment which is not related to ORMs at all, it's just a trackback for your garment business. Please do not use my blog for irrelevant trackbacks.

davidsiew said...

Hey Bill,

You've written this blog post 2 years ago now. Do you still stand by your assertions?

We are currently using NHibernate. On the surface it seems great but the amount of data being returned is ridiculous. I know that this can be fine tuned but the amount of time to learn the tool, I could have finished the application already.

So what are your thoughts on the matter as of 2011?

Bill Karwin said...

@davidsiew: Hi, thanks for your comment. So, has anything changed in the ORM world that might make me change my opinion of ORM's now versus two years ago?

No. ORM's help to improve development productivity. To some extent they do succeed in this goal (once you learn to use the ORM, which we agree has a learning curve).

The resulting application often has runtime performance penalties that persist long after your manager has forgotten the time you saved from the development schedule by using the ORM.

vincode said...

what do you think about the performance of ORM? I found that using mysql native query will produce a better performance though produce a messy code too :D

vincode said...

what do you think about the performance of ORM? I found that using mysql native query will produce a better performance though produce a messy code too :D

Bill Karwin said...

Hi @vincode, thanks for your comment. You're right, hand-crafting SQL queries can give you the opportunity to optimize them, where relying on the query generated by an ORM is not good at performance.

Some people say, "but for simple CRUD operations, the query is so obvious that there should be no way the ORM produces anything different from your best hand-crafted SQL. Not always true.

I have found that the more sophisticated the ORM, the more it tries to account for advanced query types, the more likely it is to turn a straightforward CRUD query into a ridiculous 10-way left join, because you might need columns from related tables, etc.

It's a good practice to use an ORM for its productivity benefits, then profile your application to identify bottlenecks. Rewrite the most important parts by hand to relieve those bottlenecks.

Think of C/C++ programmers who write in assembler code for a few code blocks where performance is crucial. Even the best compiler can't produce optimal code in every case.

Tim Luyten said...

I completly agree with bill that there is no good enough reason to use orm-tools.
Most people tend to forget that this introduces an extra dependency and therefore an extra risk into your project.
A simple library or some codegeneration will satify the request for faster development.
Also you as a developer can always better analyse the request for data, meaning you have a context that no orm mapper could translate for you, therefore optimizing your dataaccess will be difficult because you have to explain your intend to the tool, which requires a deep knowledge about the tool that not every developer has ...

Unknown said...

Great article. I fully support your views here. The initial development boost does not outweigh the overall performance penalty, slow down in refactoring and additional orm-specific skills required by to-be-hired employees.

Tom Graham said...

Great post. If you are already good at SQL, how will you ever recoup your time to learn the ORM and integrate it into your app? Especially when it comes to legacy databases (which for me, it always has-- I'm always stuck with the vagueries of some previous developer's db. ), you'll lose a huge amount of time getting the ORM to map the fields in the database. Then you'll still have to write all the queries you would have had to do in SQL.

hdiv said...

Using SQL is against OOP programming as it is not dealing with objects. Updating Huge amount of data is Database mgmt activity Vs any Business application Logic.For regular business applications ORM is perfect excluding batch or handling CRUD on Huge amount of data.
Also If you design your software where DB is not in Critical path then it much faster than having in DB in critical path.

ms4py said...

The problem in the discussion here is probably that almost everyone is talking about Java or another language which implies a long development cycle so a plain SQL database backend carries no weight in terms of development time. But for a rapid development language like Python or Ruby plain SQL is really a no-go.

@Bill IMO you have actually never worked with a good ORM implementation. Everyone knows that Hibernate is hard to learn and harder to control.

For example you could have a look at SQLAlchemy:

1. Easy to get started.
2. Does not implicit abstracts SQL away. You can generate SQL expressions just out of the language. [1]
3. Explicit eager/lazy loading per relationship and per query. [2]

[1]: http://docs.sqlalchemy.org/en/rel_0_7/core/tutorial.html

[2]: http://docs.sqlalchemy.org/en/rel_0_7/orm/loading.html

@jchen100228 Google isn't using a relational backend for its search, so your example is nonsense.

webgovernor said...

Excellent points! I'm curious how you feel about the recent trend in ORM hybrids? All they do is map rows and properties to objects, and allow you to implement data translation before updating the database? The example I'm thinking of is PyORMish, but there are others.

Personally, I'd rather use these hybrids over plain queries or an ORM.

Bill Karwin said...

@webgovernor, thanks, I took a look at the documentation and examples of PyORMish, but don't see anything that changes my answers to the "Ten Advantages of an ORM" above.

PyORMish looks fairly standard. It may help developers avoid writing tedious column-mapping code, though they have traded that for writing ORM classes. Potentially you must write multiple PyORMish classes for the same table, to be used when you want to join to different related tables.

Joos Kiener said...

I don't agree with many points brought up against ORMs. The

UPDATE Bugs SET status = 'CLOSED' WHERE status = 'OPEN';

example is kind of stupid as the ORMs I know easily let you execute native queries.

Performance can be an issue but than can also be addressed with caching and/or hardware (speak RAM and SSD). In most cases it won't be an issue because most applications do not have 10'000 requests per second. I would guess many, many company internal apps are more in the 1 request/minute or less category and hence performance is a non-issue.

If you do complex stuff and complex queries they question one needs to ask if either the domain model must be changed or if you are doing reporting through the ORM.

I haven't found a way to fetch a joined result set from two entities

I'm not even sure what is meant by this? If the relationships are defined correctly if you select parent, all childs will be selected too. -> joined result. And if I understood that wrong you can, again, if you really, really need to, run a native query. It's not like an ORM prevents your from running plain old SQL. You could theoretically build a DAL with hibernate using native queries only and that would pretty sure be faster to develop than using plain JDBC or ADO.net.

Bill Karwin said...

Hi Joos,

Actually, I think we are in agreement. The problem is to make an ORM perform better with complex queries, and your solution is to bypass the ORM in such cases and write SQL.

That's like an instant soup mix with instructions, "first, cook a pot of delicious soup from scratch, then add contents of instant soup mix packet."

If you can link to a tutorial or example showing how to fetch the result of a JOIN query in the Hibernate ORM -- *without* writing a custom DAL or native SQL -- I'd be interested in that.