Careful when using database locks in transactions
We tend to use database transactions as a magic bullet to get rid of all our concurrency problems. When things get really bad, we might even throw in some locking mechanism, but then are usually done with it.
Unfortunately, transactions semantics in databases are actually very complicated, and chances are, your making some incorrect assumptions.
The MySQL innodb engine actually has four different modes for transactions:
READ UNCOMMITED gives you the least isolation between transactions (i.e. one transaction can see most of what another is doing), and
SERIALIZABLE the best (i.e. transactions are very well isolated). Conversely,
READ UNCOMMITTED gives you the best performance, while
SERIALIZABLE will be the slowest, due to a lot of locking. This also affects locking behaviour and replication performance.
The default level is
REPEATABLE READ, and it's the mode you've probably used most of the time. One major "feature" you get in this mode are consistent nonlocking reads.
The problem with consistent nonlocking reads
If you open a MySQL transaction, the first
SELECT statement will establish a "snapshot" of the database's state. All later
SELECTs will see that snapshot (and changes made within the same transaction), but not changes made outside it.
Consider this code:
# BROKEN transaction do blog = Blog.first # <-- snapshot established here blog.posts.create!(...) blog.update_attribute :posts_count, blog.posts.count end
This will, despite the transaction, break when run in parallel. That's might not be particularly surprising, after all a transaction is not a mutex.
However, even if you put in a mutex, it will still not work:
# STILL BROKEN transaction do blog = Blog.first # <-- snapshot established here acquire_some_mutex do blog.posts.create!(...) blog.update_attribute :posts_count, blog.posts.count end end
Each process inside the transaction will not see posts created after its snapshot was established on the first read, and so the posts_count can be off.
You can fix this by immediately acquiring a mutex outside or immediately inside the transaction:
# WORKS acquire_some_mutex do transaction do blog = Blog.first blog.posts.create!(...) blog.update_attribute :posts_count, blog.posts.count end end
This is however, a very fragile solution, since any database read done before the lock will break this.
# BROKEN AGAIN transaction do user_count = User.count # <-- snapshot established here (any query to an innodb table will do) acquire_some_mutex do blog = Blog.first blog.posts.create!(...) blog.update_attribute :posts_count, blog.posts.count end end
This is broken again. And so is using any other kind of mutex, except if that mutex is either outside or directly inside the transaction block.
Note: You can also lock an ActiveRecord row while loading it like
I'm not really sure what's generally the correct solution for these kinds of issues. There are some options:
- Do the update deferred (i.e. outside of the transaction).
- Choose another transaction isolation level.
READ COMMITTEDshould not suffer from this, but might cause other isuses. You still need to put some lock around the actual update, though.
- Immediately acquire some kind of mutex (like the
:lock => trueabove) after opening the transaction, and do this on all relevant code paths. You might want to add some assertion that you're not already inside a bigger transaction.