Read more

Differences between transactions and locking

Henning Koch
April 07, 2015Software engineer at makandra GmbH

Web applications can be used by multiple users at the same time. A typical application server like Passenger has multiple worker processes for a single app. In a distributed deployment setup like we use at makandra Show archive.org snapshot you will even have multiple application servers, each with their own worker pool.

Illustration web development

Do you need DevOps-experts?

Your development team has a full backlog? No time for infrastructure architecture? Our DevOps team is ready to support you!

  • We build reliable cloud solutions with Infrastructure as code
  • We are experts in security, Linux and databases
  • We support your dev team to perform
Read more Show archive.org snapshot

This means that your code needs to deal with concurrent data access. The two main tools we use to cope with concurrency are database transactions and distributed locks. These two are not interchangeable. You can't use a transaction when you need a lock. You can't use a lock when you need a transaction.

This article explains the differences between transactions and locking. Examples will be given for the ActiveRecord ORM (from Ruby on Rails), but apply to all technologies.

Use transactions to group related changes

Use a transaction to ensure a list of changes is applied either completely or not at all.

Example: A method copy_invoice should create a copy of a given Invoice record. Here is a bad implementation that does not use transactions:

def copy_invoice(original)
  duplicate = Invoice.create!(original.recipient)
  original.items.each do |item|
    duplicate.items.create!(article: item.article, amount: item.amount)
  end
end

The implementation above has two issues:

  1. If one of the items fails to create (e.g. due to a validation error), an incomplete Invoice copy will remain in the database.
  2. If another thread or process fetches invoices before copy_invoice has terminated, it will see an incomplete invoice (e.g. an invoice without items)

The method can be improved by wrapping the operation in a transaction:

def copy_invoice(original)
  Invoice.transaction do
    duplicate = Invoice.create!(original.recipient)
    original.items.each do |item|
      duplicate.items.create!(article: item.article, amount: item.amount)
    end
  end
end

The transaction(&block) method is available on all ActiveRecord classes or record instances. It doesn't matter on which object you call it, since they all delegate to the current thread's database connection Show archive.org snapshot . If you don't have an ActiveRecord class in your scope, you can always use ActiveRecord::Base.transaction do ... end.

By using a transaction we have grouped together the creation of the copied invoice and the creation of its items. This list of changes (e.g. 12 INSERT statements) will be committed in a single atomic operation, or not at all. This has two desirable effects:

  1. Other threads or processes will not see an incomplete copy until copy_invoice has terminated.
  2. The copy of the invoice (including all of its items) either occurs as a whole, or not at all. If an exception is raised at any point, the transaction is rolled back, and the exception is re-raised.

When should I use a transaction?

Whenever you make an operation that affects more than a single database row, you should always use a transaction.

Note that in Rails, ActiveRecord::Base#save automatically opens a transaction. Hence changes you make in callbacks, nested attributes processing, etc. will automatically run inside a transaction.

What happens if two transactions run at the same time?

Nothing special happens if the "same" transaction (e.g. the copy_invoice method above) runs at the same time. They will simply run at the same time. There is no guarantee of the order.

When two concurrently running transactions are done, they will each try to commit their changes in one, atomic commit. This may or may not succeed. E.g. if both transactions try to insert the same unique key, whatever transaction commits first will succeed. The other transaction will get an error from the database adapter and all of its changes will be rolled back.

Note that when two transaction run concurrently in two threads, each thread doesn't see the other transaction's changes until they are committed successfully. It does however see its own changes (this is a dumbed-down explanation, reality is much more complicated).

ActiveRecord: What happens if I raise an exception in a transaction?

Exceptions within a transaction are typically caused from a failed validation (ActiveRecord::RecordInvalid) or from violating a database constraint (like an unique index).

If an exception is raised within an ActiveRecord transaction, all changes made within the transaction are rolled back and the exception is re-raised.

To better understand how ActiveRecord handles exceptions within a transaction, here is the implementation of ActiveRecord::Base#transaction in pseudo-code:

class ActiveRecord::Base

  def transaction(&block)
    connection.start_transaction
    block.call
    connection.commit_transaction
  rescue Exception => e
    connection.rollback_transaction
    raise e
  end

end

Why do I get deadlocks when using transactions?

Since the purpose of this article is to show that transactions and locks are not the same, you might be surprised to experience deadlocks when committing a transaction. E.g. you might see an error like this:

ActiveRecord::StatementInvalid: Mysql::Error: Lock wait timeout exceeded; try restarting transaction

This has nothing to do with the application-level locks that we talk about below. What happens here is that when MySQL makes changes, it locks rows in the database to ensure that the change will applied consistently. When two threads (two database connections) try to lock the same rows in different order, you might get a deadlock.

For instance, transaction A wants to change row #1 and row #2. Transaction B wants to change the same rows, but in different order (row #2, then row #1). Here is what happens:

  • Transaction A locks row #1
  • Transaction B locks row #2
  • Transaction A tries to lock row #2, but can't
  • Transaction B tries to lock row #1, but can't
  • Both transactions wait forever
  • MySQL times out after 50 seconds and raises an error in both database connections
  • Both Rails worker processes raise ActiveRecord::StatementInvalid with a confusing error message

The internal locking model that MySQL uses to ensure consistency is very complex Show archive.org snapshot . Also MySQL might lock many more rows that you might expect, especially when you do not have proper database indexes for the relevant statement. For updates without indexes, every row in a table gets locked, for example.

An application-level lock as described below can help you prevent deadlocks caused by concurrent transactions.
We also give some hints for debugging deadlocks in MySQL.

Use locks to prevent concurrent data access

Use locks to ensure that a critical piece of code only runs in a single thread at the same time. A lock is also called a "mutex".

Example: A method transfer_money should transfer an amount of money from one account to another. Here is a bad implementation that does not use locking:

def transfer_money(source_id, target_id, amount)
  source_account = Account.find(source_id)
  target_account = Account.find(target_id)  
  source_account.balance -= amount
  target_account.balance += amount  
  source_account.save!
  target_account.save!
end

The example above has all the issues that the transaction-less copy_invoice example at the beginning of this article had. In addition, it might lose money transactions.

Let's say two users try to transfer 5 units of currency from the same source account to the same target account. Here is what can happen:

  • Thread A retrieves both accounts. It sees that source_account and target_account both have a balance of 100.
  • Thread B retrieves both accounts. It sees that source_account and target_account both have a balance of 100.
  • Thread A sets source_account.balance to 95 and target_account.balance to 105.
  • Thread A saves both accounts and terminates.
  • Thread B also sets source_account.balance to 95 and increases target_account.balance to 105.
  • Thread B saves both accounts and terminates.

We just lost the first money transfer. The account balances should be 90 and 110, but they are 95 and 105. We call this the "lost update problem". You cannot fix this with a transaction. In fact, if you think that wrapping transfer_money in a transaction would help, you should re-read this article from the top.

What you should do instead is to wrap critical code in a lock. The lock ensures only a single thread may access it at a single time:

def transfer_money(source_id, target_id, amount)
  Account.with_advisory_lock('money-transfer-lock') do
    Account.transaction do
      source_account = Account.find(source_id)
      target_account = Account.find(target_id)  
      source_account.balance -= amount
      target_account.balance += amount  
      source_account.save!
      target_account.save!
    end
  end
end

The code example above uses the with_advisory_lock gem to synchronize multiple threads or worker processes on a given name ('transfer-money-lock' in the example). Only a single thread can be inside the lock at any given time. All other threads will block on with_advisory_lock until the first thread exits the block and releases the lock.

We additionally wrap the money transfer in a transaction just in case the second save! fails for some reason (like a lost database connection).

But I can fix the code without using a lock!

You really can't. Please don't try things like reloading the accounts before saving to make sure they're really totally positively up to date.

Also please don't write your own application-level protocol for detecting other threads manipulating the same resource. Just use a lock.

But I can write me a better lock implementation using files / memcached / Redis / INSERT_TECHNOLOGY_HERE!

Please don't unless you've been working with distributed code for a number of years.

Locks are hard, especially in a distributed deployment setup where you have multiple worker processes, or multiple app servers with their own worker pool each. Unless you really know what you're doing, prefer to lock against a database cluster as shown above.

But if this is true it would mean all my code is broken!

It probably is. Most web application code is not thread-safe due to missing locks. It's just that you require a lot of concurrent users to make failures become frequent enough that they will notify you, the developer.

Another reason why you're not seeing these failures in development that you are using a development server like Webrick that has a single worker process. Thus you are never experiencing concurrency anyway. Because of this you should instead use Passenger for development.

Henning Koch
April 07, 2015Software engineer at makandra GmbH
Posted by Henning Koch to makandra dev (2015-04-07 13:16)