
MySQL: Careful when using database locks in transactions

Tobias Kraze
July 26, 2013
Software engineer at makandra GmbH

We tend to use database transactions as a magic bullet to get rid of all our concurrency problems. When things get really bad, we might even throw in some locking mechanism, but then we usually consider ourselves done.

Unfortunately, transaction semantics in databases are actually very complicated, and chances are you're making some incorrect assumptions.

The MySQL InnoDB engine actually has four different isolation levels for transactions:

  • READ UNCOMMITTED
  • READ COMMITTED
  • REPEATABLE READ
  • SERIALIZABLE

READ UNCOMMITTED gives you the least isolation between transactions (i.e. one transaction can see most of what another is doing), and SERIALIZABLE the best (i.e. transactions are very well isolated). Conversely, READ UNCOMMITTED gives you the best performance, while SERIALIZABLE will be the slowest, due to a lot of locking. The chosen level also affects locking behaviour and replication performance.
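
If you want to see or change the level a connection is running at, you can do so from Ruby. A minimal sketch (@@tx_isolation is MySQL's session variable; MySQL 8 renamed it to @@transaction_isolation):

connection = ActiveRecord::Base.connection

# Inspect the current session's isolation level
connection.select_value("SELECT @@tx_isolation")
# => "REPEATABLE-READ"

# Only affects transactions subsequently opened on this connection
connection.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED")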

The default level is REPEATABLE READ, and it's the mode you've probably used most of the time. One major "feature" you get in this mode is consistent nonlocking reads.

The problem with consistent nonlocking reads

If you open a MySQL transaction, the first SELECT statement will establish a "snapshot" of the database's state. All later SELECTs will see that snapshot (and changes made within the same transaction), but not changes made outside it.
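
You can watch this happen with two connections checked out straight from ActiveRecord's pool. A sketch, assuming a blogs table with a name column:

conn_a = ActiveRecord::Base.connection_pool.checkout
conn_b = ActiveRecord::Base.connection_pool.checkout

conn_a.begin_db_transaction
conn_a.select_value("SELECT COUNT(*) FROM blogs") # snapshot established here

# Insert and auto-commit a new blog on the second connection
conn_b.execute("INSERT INTO blogs (name) VALUES ('new blog')")

conn_a.select_value("SELECT COUNT(*) FROM blogs") # still returns the old count
conn_a.commit_db_transaction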

Consider this code:

# BAD
transaction do
  blog = Blog.first # Database snapshot established here
  blog.posts.create!(...)
  blog.update_attribute :posts_count, blog.posts.count
end

When run in parallel, updating the posts count can go wrong: two processes can each create a post and then write back a count that does not include the other's new post. This should not be particularly surprising; after all, a transaction is not a mutex.

However, even if you wrap it in a mutex, it can still break:

# STILL BAD
transaction do
  blog = Blog.first # Database snapshot established here
  acquire_some_mutex do
    blog.posts.create!(...)
    blog.update_attribute :posts_count, blog.posts.count
  end
end

Each process's transaction will not see posts that other processes created (and committed) after its snapshot was established on the first read, and so the posts_count can be off.
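
To make this concrete, here is a sketch that can reproduce the stale count with two threads, each of which gets its own database connection. A plain Ruby Mutex stands in for acquire_some_mutex, and posts are created without attributes to keep it short:

mutex = Mutex.new

threads = 2.times.map do
  Thread.new do
    Blog.transaction do
      blog = Blog.first # each thread establishes its own snapshot here
      mutex.synchronize do
        blog.posts.create!
        # This count is taken against the thread's own snapshot (plus its
        # own new post) and misses what the other thread has committed:
        blog.update_attribute :posts_count, blog.posts.count
      end
    end
  end
end
threads.each(&:join)

Blog.first.posts.count # => 2
Blog.first.posts_count # => possibly 1, depending on timing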

You can fix this by acquiring a mutex before the first database access, i.e. directly around or inside the transaction:

# GOOD
acquire_some_mutex do
  transaction do
    blog = Blog.first
    blog.posts.create!(...)
    blog.update_attribute :posts_count, blog.posts.count
  end
end

Note: You can also lock an ActiveRecord row while loading it, e.g. with Blog.first(lock: true).
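
For example, the blog row itself can serve as the mutex via ActiveRecord's pessimistic locking. A sketch using with_lock, which opens a transaction and reloads the record with SELECT ... FOR UPDATE:

blog = Blog.first
blog.with_lock do
  # The row lock is held here; the consistent-read snapshot is only
  # established by the first ordinary SELECT after it, so the count
  # sees all previously committed posts.
  blog.posts.create!(...)
  blog.update_attribute :posts_count, blog.posts.count
end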

Potential solutions

I'm not really sure what's generally the correct solution for these kinds of issues. There are some options:

  1. Defer the update (i.e. perform it outside of the transaction).
  2. Choose another transaction isolation level. READ COMMITTED should not suffer from this, but might cause other issues. You still need to put some lock around the actual update, though (see the sketch after this list).
  3. Immediately acquire some kind of mutex (like the :lock => true above) after opening the transaction, and do this on all relevant code paths. You might want to add some assertion that you're not already inside a bigger transaction.
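
For option 2, a sketch assuming Rails 4, where transaction accepts an :isolation option (on older Rails you would issue SET TRANSACTION ISOLATION LEVEL yourself):

Blog.transaction(isolation: :read_committed) do
  # READ COMMITTED re-reads committed data on every SELECT instead of
  # pinning a snapshot, but the read-modify-write still needs a lock:
  blog = Blog.lock.first # SELECT ... FOR UPDATE
  blog.posts.create!(...)
  blog.update_attribute :posts_count, blog.posts.count
end
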
Posted by Tobias Kraze to makandra dev (2013-07-26 17:38)