Posted over 7 years ago. Visible to the public.

ActiveRecord scopes [4d]

In this card we will learn to write code that scales with a large number of database records. We will do this by pushing work out of Ruby and into the database.


  • Understand what scopes ("relations") are and how they differ from arrays.

  • Understand how a purely object-oriented approach of processing data doesn't scale for a large number of database records.

  • Get into a habit of off-loading list-related calculations (filtering, mapping) to the database. Only fetch an ActiveRecord objects from the database when you actually plan to display or change it.

  • Understand that an expression like User.where(email: '') does not make an SQL query. It simply returns a scope object for further chaining with #where, #order, etc.

    Scopes will make an SQL query when you use them like an array, e.g. by iterating over them with #each. This may cause the scope to accidentally load its results prematurely. To prevent this, get into a habit of explicitely loading scopes by calling to_a when you're done filtering in the database.

  • Learn about the N+1 query problem Archive and eager loading.

  • Understand the difference between "named scopes" Archive and ad-hoc scopes using #where.


We don't like to use default scopes Archive in our projects. It can be confusing when new scopes inherit conditions out of nowhere.

If an app does have a frequently used condition (like ignoring all archived posts), we define a named scope like and use that.




Preventing scopes from loading

A scope like User.where(email: '') does not make an SQL query. It simply returns a scope object for further chaining with #where, #order, etc.

It can be confusing that typing a scope into the Rails console will immediately load results from the database. This is due to the console automatically inspecting Archive the last statement, in an attempt to visualize the results.

To prevent this, add any other expression after the scope:

scope = Movie.all; nil

Call #to_sql on a scope to see the query it will produce. This does not actually run a query:

Card.where(title: 'Foo').to_sql
# => SELECT * FROM cards WHERE title="Foo"

Advanced conditions

You can pass an array as a value in #where. ActiveRecord will make an IN() Archive query:

Card.where(id: [1,2,3]).to_sql
# => SELECT * FROM cards WHERE id IN (1, 2, 3)

If you need an operator that is not equality (=) or inclusion (IN()) you cannot pass a hash to where. You need to pass an SQL fragment instead:

Card.where('id > ?', 5).to_sql
# => SELECT * FROM cards WHERE id > 5

Chaining multiple #where will AND the conditions:

Card.where(id: [1,2,3]).where(title: 'Foo').to_sql
# => SELECT * FROM cards WHERE id IN (1, 2, 3) AND title="Foo"

Preloading associations

When you preload asociations with #preload or #includes Rails will make one query for each table:

Card.where(title: 'Foo').preload(:user).to_a  

# SELECT * FROM cards WHERE title="Foo"
# | id | user_id | title
# +----+---------+-------------------
# |  2 |    4003 | Foo
# |  4 |    4005 | Foo
# SELECT * FROM users WHERE id IN (4003, 4005)

Plucking columns

You can use #pluck to fetch a single column from a scope's result set. The result is returned as an array:

user_ids = Card.where(title: 'Foo').pluck(:user_id)
# => [4003, 4005]

This is much faster than loading an array of Card records!

A common technique is to use an array of plucked foreign keys in another query:

User.where(id: user_ids).to_sql
# => SELECT * FROM users WHERE id IN (4003, 4005)

This way, instead of one complicated query with many JOINs, you use multiple but simple queries. This can be easier to understand and may even perform better.


Cards query (makandra only)

Ask someone for a dump of the staging target and load it into your local database Archive .

In the makandra cards app, load an array of all Users that are an admin in a deck whose title begins with makandra (e.g. makandra-talent).

Try to write as little SQL as possible. Wherever possible, define conditions using #where(hash). If you do need to use an SQL fragment, make it as short as possible.

It's OK to do to use more than one query (or one scope) for this exercises. However, your solution should always use the number of queries, regardless of the number of decks.

Search for MovieDB

In your MovieDB app, adapt the existing implementation to return a scope of movies matching the following query format:

  • The query is a list of space-separated words that all need to appear in the movie title in order for the movie to match (e. g. event horizon)
  • A word can be prefixed by a minus character to exclude that word (e.g. event -horizon).

If you haven't already considered this, make sure that the returned value is not an array but a scope that can be chained further. E.g. I should be able to use it like this:'event -main').search('horizon').where('year > 1995').order(:year).to_a

Adapt both the Cucumber scenario and RSpec examples for the search functionality, using the guidelines from the "Testing" chapter in Growing Rails Applications in Practice Archive .

  • RSpec examples should cover all edge cases
  • Cucumber scenario should only cover one "happy path" to show successful integration

Open budgets in Project Hero (makandra only)

In the Project Hero app, build a view that lists all projects that have open (unclosed) budgets with unused amounts:

| Project       | Unused amount in all open budgets |
| Foo           | 200 €                             |
| Bar           | 150.000 €                         |
| Baz           | 90.000 €                          |

Rows for projects with more than 2500 € of unused budget should be highlighted visually.

To get realistic data for this exercise, ask someone for a dump of the makandra stage and load it into your local database Archive .

Push your changes into a branch prefixed with your initials and a slash (/). E.g. if your name is John Doe push it into a branch jd/curriculum-open-budgets.

Version 1: OOP

First, build it naively by iterating over all projects, iterating over the budgets and activities of each project, and summing up amounts in Ruby.

When you're done, run tail -f log/development.log, reload the report and observe the performance of your view:

  • How long does your code need to compute the report?
  • How many SQL queries does it trigger?
  • How many Project, Budget and Activity objects are instantiated to handle the request?

Version 2: With scopes

Now build a version of your view that does the same number of queries regardless of how many projects, budgets, activities, etc. you have.

This new version should also not instantiate thousands of ActiveRecord objects, regardless of the database size.


You don't need to produce the report in a single query!
Consider running multiple separate queries, then compose the results in Ruby. The #index_by Archive and #group_by Archive methods are useful for that.

When the report works: Can you move your code to a OpenBudgetReport class with a nice API? For this the OpenBudgetReport needs to do as much work as possible and offer comfortable and minimum methods to its callers (the controller and view).


For the purpose of this exercise, please don't reuse any existing classes that solve a similar problem in Project Hero, such as BudgetCalculator or Analysis::Budgets.


Using scopes to decouple logic

Scopes can be a great tool to decompose complicated queries into multiple aspects or processing steps.

To illustrate this, let's expand our requirements for the search:

  • Our search method should filter movie records as described above
  • We also have an authorization rule that says users may only see the movies that they created (Movie#user_id points to the author).
  • We also paginate results so we only show 50 records at a time.

To implement this we can expand our method like this:

def search_movies(query:, user:, page:)
  page_size = 50
  page_offset = (page - 1) * page_size

    .where(user_id:              # authorization
    .where('title ILIKE ?', "%#{query}%") # very naive search
    .offset(page_offset)                  # pagination
    .limit(page_size)                     # pagination

results = search_movies(query: 'sunshine', user: current_user, page: 3)

What's not so nice is that our method now has three responsibilities mixed together:

  • Filter movies
  • Honor authorization
  • Paginate

By leveraging scopes we can split this into three methods that each have a single responsibility:

def accessible_movies(scope, user)

def search_movies(scope, query)
  scope.where('title ILIKE ?', "%#{query}%")

def paginate(scope, page)
  page_size = 50
  page_offset = (page - 1) * page_size

scope = Movie.all
scope = accessible_movies(scope, current_user)
scope = search_movies(scope, 'sunshine')
scope = paginate(scope, 3)
  • Each method has a single responsibility. In practice they would live in different classes, like MovieSearch, Authorization, Pagination.
  • The methods can be reused for other use cases, not just search. E.g. accessible_movies is a useful starting point for any method that filters movies. And paginate works for any kind of ActiveRecord scope, not just movies.

Discuss this with your mentor.

Henning Koch
Last edit
5 months ago
Henning Koch
About this deck
We are makandra and do test-driven, agile Ruby on Rails software development.
Posted by Henning Koch to makandra Curriculum (2015-07-08 19:31)