hibernar.org
A friendly forum for Hibernate users

All those confusing persistence methods

When you want to write your object to the database, the Hibernate Session has a confusing abundance of methods to choose from.
In this article, I try to make sense of them.



Perhaps the most common misconception about these methods is that save() always generates an SQL INSERT statement in the database, and update() always generates an SQL UPDATE in the database.
This is not true but, unfortunately, the documentation provided by the Hibernate team does nothing to clarify it.

Instead of thinking in terms of which SQL statements are sent to the database, you should think in terms of what the state of your object is at any given moment.

A bean in a Hibernate application can be either in a "transient" or in a "persistent" state.
A fresh "a" object, newly created by calling
  A a=new A();
is transient, because it hasn't been associated with any Hibernate session yet.
What the methods
  session.save(a);
or
  session.persist(a);
do is to make that "a" object persistent, which means that the object "a" becomes associated with the session.
Does it mean that an SQL statement will be executed immediately? Not necessarily; let Hibernate worry about when the SQL statements will be written to the database, or what kind of SQL statements those will be.
It is incorrect to think that the methods save() and persist() will always generate an INSERT statement.
Consider the following example:
example 1
1		SessionFactory sef=cfg.buildSessionFactory();
2		Session session=sef.openSession();
3		A a=new A();
4		session.persist(a);
5		a.setName("Mario");
6		session.flush();
The call to persist() at line 4 links the object a, which was transient, to the session, making it persistent. This causes an INSERT to occur.
But later I change another property of the object (name). So at the time of flushing the session (that is, making the database consistent with the session), an extra UPDATE statement has to be generated, in order to correct the name to "Mario".
Hibernate: insert into A (NAME, ID) values (?, ?)
Hibernate: update A set NAME=? where ID=?
So, in this case, by using persist() to link the object to the session, I ended up causing 2 SQL statements to occur: an INSERT and an UPDATE.
The proof? Run the same code without the persist() call: nothing happens, no SQL code is generated.

If I had waited until I had finished modifying the object before making it persistent, then one simple SQL INSERT would have sufficed to write that object to the database.
example 2
SessionFactory sef=cfg.buildSessionFactory();
Session session=sef.openSession();
A a=new A();
a.setName("Mario");
session.persist(a);
session.flush();
In this case, only one SQL INSERT is generated:
Hibernate: insert into A (NAME, ID) values (?, ?)
which inserts both the id and the name at the same time.
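
If you wonder why the order of the calls matters, here is a minimal sketch of the idea in plain Java. It is a toy model, not Hibernate's code: ToyBean, ToySession and sqlLog are names I made up for this illustration, and the sketch assumes that the session simply remembers the state the bean had when persist() was called and compares against it when flushing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model only: none of these classes belong to Hibernate.
class ToyBean {
    String name;
}

class ToySession {
    // For each persistent bean, the name it had when it was last written or attached.
    private final Map<ToyBean, String> snapshots = new LinkedHashMap<>();
    private final List<ToyBean> pendingInserts = new ArrayList<>();
    final List<String> sqlLog = new ArrayList<>();

    // Attaches the bean to the session; no SQL is written here.
    void persist(ToyBean bean) {
        if (snapshots.containsKey(bean)) {
            return; // already persistent: nothing to do
        }
        snapshots.put(bean, bean.name);
        pendingInserts.add(bean);
    }

    // Writes the queued INSERTs, then an UPDATE for every bean that has
    // changed since its snapshot was taken.
    void flush() {
        for (ToyBean bean : pendingInserts) {
            sqlLog.add("insert name=" + snapshots.get(bean));
        }
        pendingInserts.clear();
        for (Map.Entry<ToyBean, String> entry : snapshots.entrySet()) {
            ToyBean bean = entry.getKey();
            if (!Objects.equals(entry.getValue(), bean.name)) {
                sqlLog.add("update name=" + bean.name);
                entry.setValue(bean.name); // the snapshot is now up to date
            }
        }
    }
}

class ToyFlushDemo {
    public static void main(String[] args) {
        ToySession first = new ToySession();
        ToyBean a = new ToyBean();
        first.persist(a);   // attached while the name is still null
        a.name = "Mario";
        first.flush();
        System.out.println(first.sqlLog);  // [insert name=null, update name=Mario]

        ToySession second = new ToySession();
        ToyBean b = new ToyBean();
        b.name = "Mario";
        second.persist(b);  // attached after the name was set
        second.flush();
        System.out.println(second.sqlLog); // [insert name=Mario]
    }
}
```

In this toy, as in the two examples above, persisting first costs an INSERT plus an UPDATE, while setting the name first costs a single INSERT.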

Finally, notice that, once the session has been "linked" to an object, it is not necessary to keep calling persist() on that object after every modification. (The session will know that the object has been modified).
In the following code:
example 3
1		SessionFactory sef=cfg.buildSessionFactory();
2		Session session=sef.openSession();
3		A a=new A();
5		session.persist(a);
6		a.setName("Mario");
7		session.persist(a);
8		session.flush();
... the second call to "persist()" at line 7 is unnecessary. The object was already persistent!

So, I hope this was enough to convince you that thinking in terms of "save/persist always generating an INSERT" is wrong.
In the first example, calling persist() generated an INSERT and an UPDATE; in the second example, it only generated an INSERT; and in the third example the second call was superfluous and didn't generate anything.


"OK, I understand that both save() and persist() make an object persistent, but what is the difference between the two?"


Very little, apart from the obvious fact that persist() doesn't return an ID and save() does.
For a transient object whose ID is configured as autogenerated, for example, whether you use save() or persist() to make it persistent, the ID will be generated and available before any SQL is sent to the database, as the following 2 pieces of code show:
example 4
SessionFactory sef=cfg.buildSessionFactory();
Session session=sef.openSession();
A a=new A();
session.persist(a);
logger.info("the generated ID is " + a.getIdA());
session.flush();

example 5
SessionFactory sef=cfg.buildSessionFactory();
Session session=sef.openSession();
A a=new A();
session.save(a);
logger.info("the generated ID is " + a.getIdA());
session.flush();

In both cases the output is the same.
INFO - the generated ID is 402881e51ca5cfbc011ca5cfbf580001
Hibernate: insert into A (NAME, ID) values (?, ?)
However, the Hibernate team recommends using save() when it is critical to obtain and use a generated ID immediately in your code, and using persist() for long transactions that don't need access to the IDs of the objects made persistent.
There might be some internal optimization that justifies that recommendation, but frankly, I don't know.
I do know that there was some dissension within the Hibernate team as to whether to keep those two separate methods (save() and persist()), or to make it all save().
It is possible that persist() was kept as a separate method just to provide something familiar to the users of other persistence APIs (for example, JPA's EntityManager uses a method called persist()).
The bottom line is: use whichever you want.

What is the purpose of the session.update() method?
In addition to transient and persistent, there is a third state in which a bean can be in a Hibernate application: detached.
An object is detached when the session to which it was attached has been closed.
Hibernate can keep working with that object if necessary, but first it has to be "reattached" to a new session.
Consider the following example:
example 6
1		SessionFactory sef=cfg.buildSessionFactory();
2		Session session=sef.openSession();
3		A a=new A();
4		a.setName("Mario");
5		session.persist(a);
6		session.flush();
7		session.close();
8
9		logger.info("using new session now");
10		Session session2=sef.openSession();
11		a.setName("Luigi");
12		session2.update(a);
13		session2.flush();
The output of this example is
Hibernate: insert into A (NAME, ID) values (?, ?)
INFO - using new session now
Hibernate: update A set NAME=? where ID=?
Notice that we could have interchanged the places of lines 11 and 12, and the example would still have produced the same output.
Notice also that we could not have called session2.persist(a) on line 12, because Hibernate would have complained that persist() cannot be called on a detached object.
You don't have to think of session.update() as a command for generating an SQL UPDATE statement to the database. (As we saw in example 1, SQL UPDATEs can occur even if no session.update() methods are called).
What happened here is that session.update() converted an object that was detached (i.e., in a stale, invalid state regarding the current session) into a persistent object again.
When session2 is flushed, an SQL UPDATE occurs in order to persist to the database whatever changes happened to the object a. But it is not the session.update() itself that caused the UPDATE to occur; rather, it was everything that happened around it: having created a new session, having reattached the object to it, having modified something on the object, and having flushed this new session.
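
The three states and the transitions between them can be sketched in a few lines of plain Java. Again, this is only a toy model under my own assumptions, not Hibernate's implementation: ToyState, StateBean and StateSession are invented names, and the toy decides that a bean is detached simply because it already has an ID but is not attached to the session at hand.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: a sketch of the three states, not Hibernate's code.
enum ToyState { TRANSIENT, PERSISTENT, DETACHED }

class StateBean {
    Integer id; // stays null until the bean is made persistent for the first time
}

class StateSession {
    private static int nextId = 1; // stand-in for an ID generator
    private final Set<StateBean> attached = new HashSet<>();

    ToyState stateOf(StateBean bean) {
        if (attached.contains(bean)) {
            return ToyState.PERSISTENT;
        }
        return bean.id == null ? ToyState.TRANSIENT : ToyState.DETACHED;
    }

    // transient -> persistent; detached beans are refused
    void persist(StateBean bean) {
        if (stateOf(bean) == ToyState.DETACHED) {
            throw new IllegalStateException("detached bean passed to persist()");
        }
        if (bean.id == null) {
            bean.id = nextId++;
        }
        attached.add(bean);
    }

    // detached -> persistent; note that nothing resembling SQL happens here
    void update(StateBean bean) {
        if (bean.id == null) {
            throw new IllegalStateException("transient bean passed to update()");
        }
        attached.add(bean);
    }

    // Closing the session detaches everything that was attached to it.
    void close() {
        attached.clear();
    }
}

class StateDemo {
    public static void main(String[] args) {
        StateBean a = new StateBean();
        StateSession session = new StateSession();
        System.out.println(session.stateOf(a));  // TRANSIENT
        session.persist(a);
        System.out.println(session.stateOf(a));  // PERSISTENT
        session.close();

        StateSession session2 = new StateSession();
        System.out.println(session2.stateOf(a)); // DETACHED
        session2.update(a);
        System.out.println(session2.stateOf(a)); // PERSISTENT
    }
}
```

The point of the sketch is that update() only moves the bean from one state to another; whether an SQL UPDATE is needed is decided later, at flush time.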

Now you can repeat these 2 mantras:
save()/persist() do not generate an SQL INSERT, they just make an object persistent.
update() does not generate an SQL UPDATE, it just attaches a detached object to a session.

Repeat these 2 mantras, and you will soon find yourself in a higher state of consciousness.
Soon you will stop doing what most novice Hibernate programmers do: calling an update() method every time they change something on a persistent object (expecting that an SQL UPDATE will be issued to the database because of that).
example 7
1		SessionFactory sef=cfg.buildSessionFactory();
2		Session session=sef.openSession();
3		A a=new A();
4		session.persist(a);
5		a.setName("Mario");
6		session.update(a);
7		session.flush();
8		session.close();
Line 6 in the previous example is completely unnecessary. Object a is already persistent, attached to a session object! Changes to it will be "detected" by the session.

As a matter of fact, if you work with web applications (as most of us do), or use some sort of "managed environment", as it is almost mandatory to do for any non-trivial application, the chances of having to deal programmatically with more than one session at once are very slim.
Hence, whole applications can be written without using update() at all.


What does the saveOrUpdate() method do?
Long ago the Hibernate team invented this method which, after checking for the presence of an ID in a bean, determines whether it should call a save() or an update() on it.
In my opinion, since we will almost never use update(), we have even less reason to use this method, if any at all.
If, in some part of your application, you don't know whether a bean is detached or transient, you should probably revise the design of your application instead of resorting to this method.
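
For the curious, the decision described above boils down to something like the following sketch. It is deliberately simplistic and rests on my own assumptions: DecisionBean and SaveOrUpdateSketch are invented names, a null ID is taken to mean "no ID assigned yet", and the check Hibernate really performs is more involved than this.

```java
// Toy sketch only: invented names, not part of Hibernate.
class DecisionBean {
    Integer id; // null means "no ID assigned yet" (an assumption of this sketch)
}

class SaveOrUpdateSketch {
    // Returns the name of the operation a saveOrUpdate()-style method would delegate to.
    static String choose(DecisionBean bean) {
        return bean.id == null ? "save" : "update";
    }

    public static void main(String[] args) {
        DecisionBean fresh = new DecisionBean();
        DecisionBean known = new DecisionBean();
        known.id = 7;
        System.out.println(choose(fresh)); // save
        System.out.println(choose(known)); // update
    }
}
```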

So we can add two new mantras:
don't use update()
don't use saveOrUpdate()


What does the merge() method do?
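The merge() method checks the ID of the object we are sending as a parameter. If an object with that ID already exists (in the session or, failing that, in the database), the state of the parameter object is copied onto it. If no such ID exists, a copy of the parameter object is added to the session. In both cases merge() returns the persistent instance; the parameter object itself does not become attached to the session.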

It is a useful method, which I explain in the following example:
Example 8:
Suppose we have a DB with a single table, SINGERS, which contains just an integer SINGER_ID field (autogenerated by Hibernate), and the singer's name SINGER_NAME.
At the time of starting our exercise, we already have 3 singers in that table.
SINGER_ID  SINGER_NAME
1          Ricky Martin
2          Madonna
3          Elvis Presley

The Singer.java bean is very simple.
package test4;

public class Singer {
	private int id;
	private String name;

	public Singer(){}

	public int getId() {
		return id;
	}

	public void setId(int id) {
		this.id = id;
	}

	public String getName() {
		return name;
	}

	public void setName(String name) {
		this.name = name;
	}

}


The mapping file is equally simple. The ID of a singer is an autoincremented integer.
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
	"-//Hibernate/Hibernate Mapping DTD 3.0//EN"
		"http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping package="test4" >
  <class name="Singer" table="SINGERS">
    <id name="id" type="int" column="ID" >
      <generator class="increment"/>
    </id>
    <property name="name" column="NAME" type="string"/>
  </class>
</hibernate-mapping>

Now, if I created a totally new (transient) object, without ID, called it "Luciano Pavarotti", and called merge() with it as a parameter, it would simply be added, with a new autogenerated ID=4.
   		SessionFactory sef=cfg.buildSessionFactory();
   		Session session=sef.openSession();
   		Singer singer=new Singer();
   		singer.setName("Luciano Pavarotti");
   		session.merge(singer);
   		session.flush();

SINGER_ID  SINGER_NAME
1          Ricky Martin
2          Madonna
3          Elvis Presley
4          Luciano Pavarotti

If, instead, I used an existing ID for my new object (say, 2), and called merge() with it as a parameter, it would replace the existing object with ID=2 (Madonna).
   		SessionFactory sef=cfg.buildSessionFactory();
   		Session session=sef.openSession();
   		Singer singer=new Singer();
   		singer.setId(2);
   		singer.setName("Luciano Pavarotti");
   		session.merge(singer);
   		session.flush();

SINGER_ID  SINGER_NAME
1          Ricky Martin
2          Luciano Pavarotti
3          Elvis Presley
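
The two cases above can be mimicked with a small in-memory toy. It is not Hibernate's implementation, just a sketch under my own assumptions: ToySinger and ToyMergeSession are invented names, a null ID stands for "no ID yet" (the real bean uses a primitive int), and new IDs are handed out as "highest existing ID plus one". It also shows a detail that is easy to miss: merge() copies the state onto a session-managed instance and returns that instance, while the object passed as a parameter stays unattached.

```java
import java.util.TreeMap;

// Toy model only: an in-memory stand-in for the SINGERS table and the session.
class ToySinger {
    Integer id;   // null means "no ID yet" (an assumption of this sketch)
    String name;

    ToySinger(Integer id, String name) {
        this.id = id;
        this.name = name;
    }
}

class ToyMergeSession {
    // Managed instances keyed by ID, pre-loaded with the three rows of the example.
    final TreeMap<Integer, ToySinger> managed = new TreeMap<>();

    ToyMergeSession() {
        managed.put(1, new ToySinger(1, "Ricky Martin"));
        managed.put(2, new ToySinger(2, "Madonna"));
        managed.put(3, new ToySinger(3, "Elvis Presley"));
    }

    // Copies the state of 'given' onto the managed instance with the same ID,
    // or onto a new managed instance with the next free ID.
    // Returns the managed instance; 'given' itself is never attached.
    ToySinger merge(ToySinger given) {
        ToySinger target = (given.id == null) ? null : managed.get(given.id);
        if (target == null) {
            int newId = managed.isEmpty() ? 1 : managed.lastKey() + 1;
            target = new ToySinger(newId, null);
            managed.put(newId, target);
        }
        target.name = given.name;
        return target;
    }
}

class ToyMergeDemo {
    public static void main(String[] args) {
        ToyMergeSession session = new ToyMergeSession();

        // No ID: a new row is added with the next ID.
        ToySinger added = session.merge(new ToySinger(null, "Luciano Pavarotti"));
        System.out.println(added.id + " " + added.name); // 4 Luciano Pavarotti

        // Existing ID: the managed row with that ID is overwritten.
        ToySinger replaced = session.merge(new ToySinger(2, "Bono"));
        System.out.println(replaced.id + " " + session.managed.get(2).name); // 2 Bono
    }
}
```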


Although these examples used transient objects as parameters, the nice thing about merge() is that it doesn't care whether the state of the parameter we send is transient, persistent or detached.
This is very useful if we have, for example, a collection of beans with mixed state (some transient, some persistent), and want to apply the same modification to all of them, and make sure that such modification is persisted.
For example, with the same beans, mappings and values as before, suppose we execute the following client code.
		SessionFactory sef=cfg.buildSessionFactory();
		Session session=sef.openSession();
		Query query=session.createQuery("select s from Singer s");
		List<Singer> singers=query.list();

		Singer bono=new Singer(); bono.setName("Bono");
		singers.add(bono);
		Singer edith=new Singer(); edith.setName("Édith Piaf");
		singers.add(edith);

		for (Singer singer: singers){
			singer.setName(singer.getName().toUpperCase());
			session.merge(singer);
		}

		session.flush();
The first singers, those returned by the query, are persistent.
Then we add 2 more singers (Bono and Édith Piaf), which are transient.
Finally, we send the whole list to a loop that converts the names to uppercase.
Applying merge() to the already persistent objects is inconsequential, but valid. Applying merge() to the 2 new singers adds them to the session, with a new ID.
After flushing, the SINGERS table looks like this:
SINGER_ID  SINGER_NAME
1          RICKY MARTIN
2          LUCIANO PAVAROTTI
3          ELVIS PRESLEY
4          BONO
5          ÉDITH PIAF

merge() is an excellent method to call in these cases, when you have to persist an object whose state you are not sure of.
To further complicate the last example, I could have created an additional session, closed it, and applied merge() also to one of its detached objects. But you get the idea.


In sum:
Try to think of the session methods in terms of how they change the persistence state of an object, not in terms of what SQL statements they generate.
Forget the correspondence save()=INSERT/update()=UPDATE. It is misleading and not necessarily true.
save() and persist() are more or less the same.
Avoid using update() or saveOrUpdate().
Use merge() when unsure of the persistence state of your object.

Questions/comments about this article?
Post them in my forum, where I will always try to give you a friendly (if not expert) answer.