Read more

MySQL: Do not use "WHERE id IN (SELECT ....)"

Tobias Kraze
October 27, 2011Software engineer at makandra GmbH

Note: This applies specifically to MySQL. In PostgreSQL for example, this is not an issue.

Illustration web development

Do you need DevOps-experts?

Your development team has a full backlog? No time for infrastructure architecture? Our DevOps team is ready to support you!

  • We build reliable cloud solutions with Infrastructure as code
  • We are experts in security, Linux and databases
  • We support your dev team to perform
Read more Show archive.org snapshot

If you care about performance, never use a query like

UPDATE users SET has_message = 1 WHERE users.id IN (SELECT user_id FROM messages)

MySQL does not optimize this and seems to scan the temporary table, which isn't indexed, for every row in the update statement. This applies to other statements than UPDATE as well.

Instead, either use a JOIN like

UPDATE users INNER JOIN messages ON messages.user_id = users.id SET has_message = 1

or pipe it through ruby with something like

user_ids = ActiveRecord::Base.connection.select_values(Message.select(:user_id).to_sql) # Rails 2
user_ids = Message.all.pluck(:user_id) # Rails 3+
User.where(:id => user_ids).update_all(:has_message => 1)

Be aware that MySQL can take many but not too many arguments. Passing a list of 100'000 arguments to an IN statement won't work.

Posted by Tobias Kraze to makandra dev (2011-10-27 13:59)