Use DatabaseCleaner with multiple test databases

Rails supports connecting to multiple databases.
If you run several databases with full read and write access at the same time, you may have asked yourself how to keep all of your test databases clean. Such a setup is especially useful when migrating an old or existing database into a new one.

Your database.yml may look like this:

default: &default
  adapter: postgresql
  encoding: unicode
  username: <%= ENV['DATABASE_USER'] %>
  host: <%= ENV['DATABASE_HOST'] %>

development:
  primary:
    <<: *default
    database: my_app_development

  migration:
    <<: *default
    database: my_app_migration
    
test: &test
  primary:
    <<: *default
    database: my_app_test_<%= ENV['TEST_ENV_NUMBER'] %>
    pool: 20
  migration:
    <<: *default
    database: my_app_migration_test_<%= ENV['TEST_ENV_NUMBER'] %>
    pool: 20

Let's now have a look at your tests. After writing to the my_app_migration_test_* databases in your tests, you will notice that the records you create are not deleted and will bleed into your specs the next time you run them.
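To see which database names the ERB tags in the test section resolve to, here is a minimal sketch using only Ruby's standard library. It assumes that TEST_ENV_NUMBER is set by a parallel test runner; the helper name test_database_names is hypothetical, and the YAML is a trimmed copy of the configuration above.

```ruby
require 'erb'
require 'yaml'

# Trimmed copy of the test section from the database.yml above.
DATABASE_YML = <<~'YAML'
  default: &default
    adapter: postgresql
    encoding: unicode

  test:
    primary:
      <<: *default
      database: my_app_test_<%= ENV['TEST_ENV_NUMBER'] %>
    migration:
      <<: *default
      database: my_app_migration_test_<%= ENV['TEST_ENV_NUMBER'] %>
YAML

# Hypothetical helper: renders the ERB tags first, then parses the YAML.
# Aliases must be enabled because of the &default / *default anchor.
def test_database_names(env_number)
  ENV['TEST_ENV_NUMBER'] = env_number
  rendered = ERB.new(DATABASE_YML).result
  config = YAML.safe_load(rendered, aliases: true)
  config.fetch('test').transform_values { |db| db.fetch('database') }
end

p test_database_names('2')
```

Each test process gets its own pair of databases, so both of them need cleaning.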

You probably have DatabaseCleaner configured to take care of not bloating your test database with old records:

RSpec.configure do |config|
  config.before(:suite) do
    DatabaseCleaner.clean_with(:deletion)
  end

  config.before(:each) do
    DatabaseCleaner.strategy = :transaction
  end

  config.before(:each, transaction: false) do
    DatabaseCleaner.strategy = :deletion
  end

  config.before(:each) do
    DatabaseCleaner.start
  end

  config.after(:each) do
    DatabaseCleaner.clean
  end
end

Unfortunately, DatabaseCleaner is not aware of multiple databases in your application by default and will only clean the primary database. You need to explicitly tell it about the other database(s) by specifying the ActiveRecord ORM together with an additional db parameter. The db param however is not the name of your database, but the name of the model that connects to the second database. For example:

class MigrationRecord < ActiveRecord::Base

  self.abstract_class = true
  
  # The names refer to the `migration` entry in database.yml, not to the database name.
  connects_to database: { writing: :migration, reading: :migration }
  
  # Never allow write access to the database you are migrating. The test environment
  # is the exception, so that specs can set up pre-defined records to verify the migration.
  after_initialize :readonly!, unless: -> { Rails.env.test? }
  
end

When you define multiple databases in your database.yml, Rails needs to decide which one to pick. Any subclass of ActiveRecord::Base will connect to the primary database by default, unless you tell it to use another database with connects_to.
The MigrationRecord, on the other hand, connects to the migration database. It will act as the base class for all models that read and write their data from that database. This means we can now specify this second database when using DatabaseCleaner:

RSpec.configure do |config|
  config.before(:suite) do
    DatabaseCleaner[:active_record].clean_with(:deletion)
    DatabaseCleaner[:active_record, db: MigrationRecord].clean_with(:deletion)
  end

  config.before(:each) do
    DatabaseCleaner[:active_record].strategy = :transaction
    DatabaseCleaner[:active_record, db: MigrationRecord].strategy = :transaction
  end

  config.before(:each, transaction: false) do
    DatabaseCleaner[:active_record].strategy = :deletion
    DatabaseCleaner[:active_record, db: MigrationRecord].strategy = :deletion
  end

  config.before(:each) do
    DatabaseCleaner.start
  end

  config.after(:each) do
    DatabaseCleaner.clean
  end
end

This way all of your databases will be cleared correctly. You are of course free to add as many databases as you like following the same principle.

How it works internally

If you're interested in what DatabaseCleaner does internally and why this works, here is the explanation:

DatabaseCleaner uses an internal hash subclass to store a list of cleaners. A cleaner itself is identified by a tuple consisting of an ORM-identifier and an optional database kwarg (specified using the model representative). It may be added by calling the indexer method on the DatabaseCleaner module: DatabaseCleaner[:active_record, db: Foo].
The DatabaseCleaner gem itself does not define any cleaners. By default, the list is empty.

When using methods like DatabaseCleaner.start or DatabaseCleaner.clean or setting a strategy (DatabaseCleaner.strategy = :transaction), DatabaseCleaner iterates over all cleaners and executes your statement for each of them.
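The behavior described above can be pictured with a toy model. This is a sketch only and not the gem's source code: ToyCleaner and ToyCleaners are hypothetical names, and the real cleaners talk to a database instead of writing to a log.

```ruby
# Toy model only. NOT the gem's implementation, just the shape described above.
class ToyCleaner
  attr_reader :orm, :db, :log
  attr_accessor :strategy

  def initialize(orm, db: nil)
    @orm = orm
    @db = db
    @log = [] # records the calls instead of touching a database
  end

  def start
    @log << :start
  end

  def clean
    @log << :clean
  end
end

# Hash subclass that stores one cleaner per (orm, db) pair.
class ToyCleaners < Hash
  # Indexer: returns the cleaner for (orm, db), creating it on first access.
  def [](orm, db: nil)
    key = [orm, db]
    fetch(key) { store(key, ToyCleaner.new(orm, db: db)) }
  end

  # Module-level calls are forwarded to every registered cleaner.
  def strategy=(strategy)
    each_value { |cleaner| cleaner.strategy = strategy }
  end

  def start
    each_value(&:start)
  end

  def clean
    each_value(&:clean)
  end
end

cleaners = ToyCleaners.new
cleaners[:active_record]                        # registers the default cleaner
cleaners[:active_record, db: 'MigrationRecord'] # registers a second cleaner
cleaners.start
cleaners.clean
p cleaners.size
```

In this model, a cleaner only exists once somebody has asked for it through the indexer, which mirrors why the second database must be mentioned explicitly.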

This means that whenever you call these methods directly on the DatabaseCleaner module, you actually perform them on a list of cleaners; there is no cleaner singleton or anything similar. You probably never noticed this because only one cleaner is defined. But where does it come from? The gem itself does not define it; it only provides a framework for concrete implementations. This is where another gem becomes important: database_cleaner-active_record. It provides the default ActiveRecord cleaner instance and is also installed in your Ruby environment. Of course, it only applies if you use ActiveRecord as your ORM.
It has the following line:

DatabaseCleaner[:active_record].strategy = :transaction

You can see that there is no db parameter. When you omit it, DatabaseCleaner will fall back to ActiveRecord::Base as the connection base in the strategies that do not use SQL. That's why, in the code example above, we could call DatabaseCleaner[:active_record].clean_with(:deletion) without specifying ApplicationRecord or ActiveRecord::Base manually.

Warning

Be careful not to accidentally call configuration methods directly on the DatabaseCleaner module when you want different configurations for multiple databases. Such a call applies to every cleaner and overwrites any individual configuration.

It's fine to use the module directly for triggering the cleaning process, though, since all cleaners should run then, or if you only have one database, of course.

As we've set identical cleaning options for each database above, we could also have registered the second database once via DatabaseCleaner[:active_record, db: MigrationRecord] and then used e.g. DatabaseCleaner.strategy = :deletion. Configuring each cleaner explicitly is more readable, though, and more convenient in case you want to change a single strategy in the future.
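The pitfall can be illustrated with a small simulation. It is a sketch under the assumption that a module-level setter simply loops over all cleaners, as described in the section on internals; FakeCleaner and broadcast_strategy are hypothetical names and not part of the gem.

```ruby
# Toy stand-in for a cleaner. NOT one of the gem's real classes.
FakeCleaner = Struct.new(:db, :strategy)

# Stands in for a module-level `DatabaseCleaner.strategy = ...` call in this toy model:
# the new strategy is applied to every cleaner, whatever was configured before.
def broadcast_strategy(cleaners, strategy)
  cleaners.each { |cleaner| cleaner.strategy = strategy }
end

cleaners = [
  FakeCleaner.new('primary', :transaction),
  FakeCleaner.new('MigrationRecord', :deletion) # individually configured
]

broadcast_strategy(cleaners, :transaction)
p cleaners.map(&:strategy) # the individual :deletion setting is gone
```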

Dominic Beger

Last edit: 2022-03-21 by Dominic Beger

License

Source code in this card is licensed under the MIT License.

Posted by Dominic Beger to makandra dev (2022-03-21 17:46)