Web applications can be used by multiple users at the same time. A typical application server like Passenger has multiple worker processes for a single app. In a distributed deployment setup like we use at makandra Show archive.org snapshot you will even have multiple application servers, each with their own worker pool.
This means that your code needs to deal with concurrent data access. The two main tools we use to cope with concurrency are database transactions and distributed locks. These two are not interchangeable. You can't use a transaction when you need a lock. You can't use a lock when you need a transaction.
This article explains the differences between transactions and locking. Examples will be given for the ActiveRecord ORM (from Ruby on Rails), but apply to all technologies.
Use transactions to group related changes
Use a transaction to ensure a list of changes is applied either completely or not at all.
Example: A method copy_invoice
should create a copy of a given Invoice
record. Here is a bad implementation that does not use transactions:
def copy_invoice(original)
duplicate = Invoice.create!(original.recipient)
original.items.each do |item|
duplicate.items.create!(article: item.article, amount: item.amount)
end
end
The implementation above has two issues:
- If one of the items fails to create (e.g. due to a validation error), an incomplete
Invoice
copy will remain in the database. - If another thread or process fetches invoices before
copy_invoice
has terminated, it will see an incomplete invoice (e.g. an invoice without items)
The method can be improved by wrapping the operation in a transaction:
def copy_invoice(original)
Invoice.transaction do
duplicate = Invoice.create!(original.recipient)
original.items.each do |item|
duplicate.items.create!(article: item.article, amount: item.amount)
end
end
end
The transaction(&block)
method is available on all ActiveRecord classes or record instances. It doesn't matter on which object you call it, since they all delegate to the
current thread's database connection
Show archive.org snapshot
. If you don't have an ActiveRecord class in your scope, you can always use ActiveRecord::Base.transaction do ... end
.
By using a transaction we have grouped together the creation of the copied invoice and the creation of its items. This list of changes (e.g. 12 INSERT
statements) will be committed in a single atomic operation, or not at all. This has two desirable effects:
- Other threads or processes will not see an incomplete copy until
copy_invoice
has terminated. - The copy of the invoice (including all of its items) either occurs as a whole, or not at all. If an exception is raised at any point, the transaction is rolled back, and the exception is re-raised.
When should I use a transaction?
Whenever you make an operation that affects more than a single database row, you should always use a transaction.
Note that in Rails, ActiveRecord::Base#save
automatically opens a transaction. Hence changes you make in callbacks, nested attributes processing, etc. will automatically run inside a transaction.
What happens if two transactions run at the same time?
Nothing special happens if the "same" transaction (e.g. the copy_invoice
method above) runs at the same time. They will simply run at the same time. There is no guarantee of the order.
When two concurrently running transactions are done, they will each try to commit their changes in one, atomic commit. This may or may not succeed. E.g. if both transactions try to insert the same unique key, whatever transaction commits first will succeed. The other transaction will get an error from the database adapter and all of its changes will be rolled back.
Note that when two transaction run concurrently in two threads, each thread doesn't see the other transaction's changes until they are committed successfully. It does however see its own changes (this is a dumbed-down explanation, reality is much more complicated).
ActiveRecord: What happens if I raise an exception in a transaction?
Exceptions within a transaction are typically caused from a failed validation (ActiveRecord::RecordInvalid
) or from violating a database constraint (like an unique index).
If an exception is raised within an ActiveRecord transaction, all changes made within the transaction are rolled back and the exception is re-raised.
To better understand how ActiveRecord handles exceptions within a transaction, here is the implementation of ActiveRecord::Base#transaction
in pseudo-code:
class ActiveRecord::Base
def transaction(&block)
connection.start_transaction
block.call
connection.commit_transaction
rescue Exception => e
connection.rollback_transaction
raise e
end
end
Why do I get deadlocks when using transactions?
Since the purpose of this article is to show that transactions and locks are not the same, you might be surprised to experience deadlocks when committing a transaction. E.g. you might see an error like this:
ActiveRecord::StatementInvalid: Mysql::Error: Lock wait timeout exceeded; try restarting transaction
This has nothing to do with the application-level locks that we talk about below. What happens here is that when MySQL makes changes, it locks rows in the database to ensure that the change will applied consistently. When two threads (two database connections) try to lock the same rows in different order, you might get a deadlock.
For instance, transaction A wants to change row #1 and row #2. Transaction B wants to change the same rows, but in different order (row #2, then row #1). Here is what happens:
- Transaction A locks row #1
- Transaction B locks row #2
- Transaction A tries to lock row #2, but can't
- Transaction B tries to lock row #1, but can't
- Both transactions wait forever
- MySQL times out after 50 seconds and raises an error in both database connections
- Both Rails worker processes raise
ActiveRecord::StatementInvalid
with a confusing error message
The internal locking model that MySQL uses to ensure consistency is very complex Show archive.org snapshot . Also MySQL might lock many more rows that you might expect, especially when you do not have proper database indexes for the relevant statement. For updates without indexes, every row in a table gets locked, for example.
An application-level lock as described below can help you prevent deadlocks caused by concurrent transactions.
We also give some hints for debugging deadlocks in MySQL.
Use locks to prevent concurrent data access
Use locks to ensure that a critical piece of code only runs in a single thread at the same time. A lock is also called a "mutex".
Example: A method transfer_money
should transfer an amount of money from one account to another. Here is a bad implementation that does not use locking:
def transfer_money(source_id, target_id, amount)
source_account = Account.find(source_id)
target_account = Account.find(target_id)
source_account.balance -= amount
target_account.balance += amount
source_account.save!
target_account.save!
end
The example above has all the issues that the transaction-less copy_invoice
example at the beginning of this article had. In addition, it might lose money transactions.
Let's say two users try to transfer 5 units of currency from the same source account to the same target account. Here is what can happen:
- Thread A retrieves both accounts. It sees that
source_account
andtarget_account
both have a balance of100
. - Thread B retrieves both accounts. It sees that
source_account
andtarget_account
both have a balance of100
. - Thread A sets
source_account.balance
to 95 andtarget_account.balance
to 105. - Thread A saves both accounts and terminates.
- Thread B also sets
source_account.balance
to 95 and increasestarget_account.balance
to 105. - Thread B saves both accounts and terminates.
We just lost the first money transfer. The account balances should be 90
and 110
, but they are 95
and 105
. We call this the "lost update problem". You cannot fix this with a transaction. In fact, if you think that wrapping transfer_money
in a transaction would help, you should re-read this article from the top.
What you should do instead is to wrap critical code in a lock. The lock ensures only a single thread may access it at a single time:
def transfer_money(source_id, target_id, amount)
Account.with_advisory_lock('money-transfer-lock') do
Account.transaction do
source_account = Account.find(source_id)
target_account = Account.find(target_id)
source_account.balance -= amount
target_account.balance += amount
source_account.save!
target_account.save!
end
end
end
The code example above uses the with_advisory_lock gem to synchronize multiple threads or worker processes on a given name ('transfer-money-lock'
in the example). Only a single thread can be inside the lock at any given time. All other threads will block on with_advisory_lock
until the first thread exits the block and releases the lock.
We additionally wrap the money transfer in a transaction just in case the second save!
fails for some reason (like a lost database connection).
But I can fix the code without using a lock!
You really can't. Please don't try things like reloading the accounts before saving to make sure they're really totally positively up to date.
Also please don't write your own application-level protocol for detecting other threads manipulating the same resource. Just use a lock.
But I can write me a better lock implementation using files / memcached / Redis / INSERT_TECHNOLOGY_HERE!
Please don't unless you've been working with distributed code for a number of years.
Locks are hard, especially in a distributed deployment setup where you have multiple worker processes, or multiple app servers with their own worker pool each. Unless you really know what you're doing, prefer to lock against a database cluster as shown above.
But if this is true it would mean all my code is broken!
It probably is. Most web application code is not thread-safe due to missing locks. It's just that you require a lot of concurrent users to make failures become frequent enough that they will notify you, the developer.
Another reason why you're not seeing these failures in development that you are using a development server like Webrick that has a single worker process. Thus you are never experiencing concurrency anyway. Because of this you should instead use Passenger for development.