Read more

Advice: Reduce scopes with joins to simple IN-queries

Henning Koch
October 24, 2012Software engineer at makandra GmbH

In theory you can take any scope and extend it with additional joins or conditions. We call this chaining scopes.

Illustration web development

Do you need DevOps-experts?

Your development team has a full backlog? No time for infrastructure architecture? Our DevOps team is ready to support you!

  • We build reliable cloud solutions with Infrastructure as code
  • We are experts in security, Linux and databases
  • We support your dev team to perform
Read more Show archive.org snapshot

In practice chaining becomes problematic when scope chains grow more complex. In particular having JOINs in your scope will reduce the scope's ability to be chained with additional JOINs without crashes or side effects. This is because ActiveRecord doesn't really "understand" your scope chain, it only mashes together strings that mostly happen to look like a MySQL query in the end.

I don't generally advice against using scopes with JOINs. What I do recommend is: In cases where you expect your scope to be chained further by consumers of your class, reduce the scope to a simple IN-query without any JOINs.

Example for a problematic scope

Here is a scope that uses a JOIN:

Site.joins(:user).where(:users => { :name => 'Bruce' })

This scope will expand to a SQL query like this:

SELECT * FROM sites INNER JOIN users WHERE sites.user_id = sites.id AND users.name = 'Bruce'

Because the scope uses a JOIN, there are limits in how we can extend the chain. For example, any additional joins or includes on the user association may raise an error or modify your query in unexpected ways.

How to reduce the scope to an IN query

The Edge Rider Show archive.org snapshot gem gives your scopes a method #to_id_query:

Site.joins(:user).where(:users => { :name => 'Bruce' }).to_id_query

#to_id_query will immediately run an SQL query where it collects all the IDs that match your scope:

SELECT sites.id FROM sites INNER JOIN users WHERE sites.user_id = sites.id AND users.name = 'Bruce'

It now uses these IDs to return a new scope that has no joins and a single condition on the id column:

SELECT * FROM sites WHERE sites.user_id IN (3, 17, 103)
Posted by Henning Koch to makandra dev (2012-10-24 12:37)