We tend to use database transactions as a magic bullet to get rid of all our concurrency problems. When things get really bad, we might even throw in some locking mechanism, but then are usually done with it.
Unfortunately, transactions semantics in databases are actually very complicated, and chances are, your making some incorrect assumptions.
The MySQL innodb engine actually has four different modes Show archive.org snapshot for transactions:
READ UNCOMMITTED
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
READ UNCOMMITED
gives you the least isolation between transactions (i.e. one transaction can see most of what another is doing), and SERIALIZABLE
the best (i.e. transactions are very well isolated). Conversely, READ UNCOMMITTED
gives you the best performance, while SERIALIZABLE
will be the slowest, due to a lot of locking. This also affects locking behaviour and replication performance.
The default level is REPEATABLE READ
, and it's the mode you've probably used most of the time. One major "feature" you get in this mode are
consistent nonlocking reads
Show archive.org snapshot
.
The problem with consistent nonlocking reads
If you open a MySQL transaction, the first SELECT
statement will establish a "snapshot" of the database's state. All later SELECT
s will see that snapshot (and changes made within the same transaction), but not changes made outside it.
Consider this code:
# BAD
transaction do
blog = Blog.first # Database snapshot established here
blog.posts.create!(...)
blog.update posts_count: blog.posts.count
end
When run in parallel, updating the posts count can go wrong. This should not be particularly surprising, after all a transaction is not a mutex.
However, even if you wrap it in a mutex, it can still break:
# STILL BAD
transaction do
blog = Blog.first # Database snapshot established here
acquire_some_mutex do
blog.posts.create!(...)
blog.update_attribute :posts_count, blog.posts.count
end
end
Each process inside the transaction will not see posts created after its snapshot was established on the first read, and so the posts_count can be off.
You can fix this by acquiring a mutex before the first database access, i.e. directly around or inside the transaction:
# GOOD
acquire_some_mutex do
transaction do
blog = Blog.first
blog.posts.create!(...)
blog.update_attribute :posts_count, blog.posts.count
end
end
Note: You can also lock an ActiveRecord row while loading it like Blog.first(lock: true)
.
Potential solutions
I'm not really sure what's generally the correct solution for these kinds of issues. There are some options:
- Do the update deferred (i.e. outside of the transaction).
- Choose another transaction isolation level.
READ COMMITTED
should not suffer from this, but might cause other issues. You still need to put some lock around the actual update, though. - Immediately acquire some kind of mutex (like the
:lock => true
above) after opening the transaction, and do this on all relevant code paths. You might want to add some assertion that you're not already inside a bigger transaction.