There is a way to use
multiple databases
in Rails.
You may have asked yourself how to keep your test databases clean when you run multiple databases with full read and write access at the same time. Such a setup is especially useful when migrating old/existing databases into a new(er) one.
Your database.yml may look like this:
default: &default
  adapter: postgresql
  encoding: unicode
  username: <%= ENV['DATABASE_USER'] %>
  host: <%= ENV['DATABASE_HOST'] %>

development:
  primary:
    <<: *default
    database: my_app_development
  migration:
    <<: *default
    database: my_app_migration

test: &test
  primary:
    <<: *default
    database: my_app_test_<%= ENV['TEST_ENV_NUMBER'] %>
    pool: 20
  migration:
    <<: *default
    database: my_app_migration_test_<%= ENV['TEST_ENV_NUMBER'] %>
    pool: 20
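To see which database names such a configuration resolves to, here is a minimal sketch that renders a shortened copy of the test section with ERB and parses it with the standard library only. The helper name test_database_names is hypothetical, and the sketch assumes that TEST_ENV_NUMBER is set by a parallel test runner; it is not how Rails itself loads the file.

```ruby
require "erb"
require "yaml"

# Shortened copy of the test section from the database.yml above.
DATABASE_YML = <<~'YAML'
  default: &default
    adapter: postgresql
  test:
    primary:
      <<: *default
      database: my_app_test_<%= ENV['TEST_ENV_NUMBER'] %>
    migration:
      <<: *default
      database: my_app_migration_test_<%= ENV['TEST_ENV_NUMBER'] %>
YAML

# Hypothetical helper: returns the database name per configuration key
# for the given TEST_ENV_NUMBER value.
def test_database_names(env_number)
  previous = ENV["TEST_ENV_NUMBER"]
  ENV["TEST_ENV_NUMBER"] = env_number
  rendered = ERB.new(DATABASE_YML).result
  # aliases: true is needed because the file uses &default / *default.
  config = YAML.safe_load(rendered, aliases: true)
  config.fetch("test").transform_values { |db| db.fetch("database") }
ensure
  # Restore the previous value (assigning nil removes the variable).
  ENV["TEST_ENV_NUMBER"] = previous
end
```

Calling test_database_names("2") yields one name per configuration key, which makes it visible that every test process gets its own pair of databases, and that both of them need cleaning.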
Let's now have a look at your tests. After writing to the my_app_migration_test_* databases in your tests, you will notice that the records you create are not deleted and will bleed into your specs the next time you run them.
You probably have DatabaseCleaner configured to take care of not bloating your test database with old records:
RSpec.configure do |config|
  config.before(:suite) do
    DatabaseCleaner.clean_with(:deletion)
  end

  config.before(:each) do
    DatabaseCleaner.strategy = :transaction
  end

  config.before(:each, transaction: false) do
    DatabaseCleaner.strategy = :deletion
  end

  config.before(:each) do
    DatabaseCleaner.start
  end

  config.after(:each) do
    DatabaseCleaner.clean
  end
end
Unfortunately, DatabaseCleaner is not aware of multiple databases in your application by default and will only clean the primary database. You need to explicitly tell it about the other database(s) by specifying the ActiveRecord ORM together with an additional db parameter. The db parameter, however, is not the name of your database, but the model class that connects to the second database. For example:
class MigrationRecord < ActiveRecord::Base
  self.abstract_class = true

  # The symbols refer to the configuration key in database.yml ("migration"),
  # not to the database name.
  connects_to database: { writing: :migration, reading: :migration }

  # You never want to allow write access to the database you are migrating,
  # unless you are in a test environment and need to set up pre-defined
  # records to check that the migration works properly with them.
  after_initialize :readonly!, unless: -> { Rails.env.test? }
end
When defining multiple database configurations in your database.yml, Rails needs to decide which one to pick. Any subclass of ActiveRecord::Base will connect to the primary database by default - unless you told it to use another database using connects_to.
The MigrationRecord, on the other hand, connects to my_app_migration. It will act as the base class for all models that read and write their data from the migration database. This means we can now specify this second database when using DatabaseCleaner:
RSpec.configure do |config|
  config.before(:suite) do
    DatabaseCleaner[:active_record].clean_with(:deletion)
    DatabaseCleaner[:active_record, db: MigrationRecord].clean_with(:deletion)
  end

  config.before(:each) do
    DatabaseCleaner[:active_record].strategy = :transaction
    DatabaseCleaner[:active_record, db: MigrationRecord].strategy = :transaction
  end

  config.before(:each, transaction: false) do
    DatabaseCleaner[:active_record].strategy = :deletion
    DatabaseCleaner[:active_record, db: MigrationRecord].strategy = :deletion
  end

  config.before(:each) do
    DatabaseCleaner.start
  end

  config.after(:each) do
    DatabaseCleaner.clean
  end
end
This way all of your databases will be cleared correctly. You are of course free to add as many databases as you like following the same principle.
How it works internally
In case you're interested in what DatabaseCleaner does internally and why this works, here is the explanation:
DatabaseCleaner uses an internal hash subclass to store a list of cleaners. A cleaner itself is identified by a tuple consisting of an ORM identifier and an optional database kwarg (specified using the model representative). It may be added by calling the indexer method on the DatabaseCleaner module: DatabaseCleaner[:active_record, db: Foo].
The DatabaseCleaner gem itself does not define any cleaners. By default, the list of cleaners is empty.
When using methods like DatabaseCleaner.start or DatabaseCleaner.clean or setting a strategy (DatabaseCleaner.strategy = :transaction), DatabaseCleaner iterates over all cleaners and executes your statement for each of them.
This means that whenever you use such a direct call in your code, you actually perform it on a list of cleaners; there is no cleaner singleton or anything similar. You probably never noticed this because there is usually only one cleaner defined. But where does it come from? The gem itself does not define it; it just provides a framework for concrete implementations. This is where another gem becomes important: database_cleaner-active_record. It provides the default ActiveRecord cleaner instance and is also installed in your Ruby environment. Of course, it only applies if you use ActiveRecord as your ORM.
It has the following line:
DatabaseCleaner[:active_record].strategy = :transaction
You can see that there is no db parameter. When you omit that parameter, DatabaseCleaner will fall back to ActiveRecord::Base as the connection base in the strategies that do not use SQL. That's why, in the code example above, we could use DatabaseCleaner[:active_record].clean_with(:deletion) without having to specify ApplicationRecord or ActiveRecord::Base manually.
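The registry described above can be modeled in a few lines of plain Ruby. The following is only a simplified sketch for illustration: the class names ToyCleaner and ToyCleaners are made up, and the gem's real classes are more involved. It shows the three ideas from the text: cleaners are keyed by an ORM identifier plus an optional db, the indexer creates a cleaner on first access, and registry-level calls are forwarded to every cleaner.

```ruby
# Hypothetical stand-in for a single cleaner; it only records what it was told.
class ToyCleaner
  attr_reader :orm, :db, :calls
  attr_accessor :strategy

  def initialize(orm, db: nil)
    @orm = orm
    @db = db # nil models "no db given", i.e. the default connection
    @calls = []
  end

  def start
    @calls << :start
  end

  def clean
    @calls << :clean
  end
end

# Hypothetical hash subclass that creates one cleaner per [orm, db] pair on demand.
class ToyCleaners < Hash
  def [](orm, db: nil)
    key = [orm, db]
    fetch(key) { store(key, ToyCleaner.new(orm, db: db)) }
  end

  # Registry-level calls are forwarded to every registered cleaner.
  def strategy=(value)
    each_value { |cleaner| cleaner.strategy = value }
  end

  def start
    each_value(&:start)
  end

  def clean
    each_value(&:clean)
  end
end
```

In this model, cleaners[:active_record] and cleaners[:active_record, db: :migration_record] are two separate entries, and a single cleaners.start reaches both of them.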
Warning
Be careful not to accidentally call any method directly on the DatabaseCleaner module whenever you want to specify different configurations for multiple databases. This will overwrite any individual configuration.
It's fine, though, to use it for triggering the cleaning process, for example, as all cleaners should run then. It is also fine if you only have one database, of course.
As we've set identical cleaning options for each database above, we could also have used e.g. DatabaseCleaner.strategy = :deletion after registering the second database once with DatabaseCleaner[:active_record, db: MigrationRecord]. Configuring each cleaner explicitly is more readable, though, and more convenient in case you want to change a strategy in the future.
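The overwriting effect from the warning above can be illustrated with a small toy model. This is a sketch under assumptions, not the gem's code: the strategies are kept in a plain hash keyed by [orm, db], the method names are hypothetical, and the symbol :MigrationRecord merely stands in for the model class.

```ruby
# Toy model: each [orm, db] key maps to that cleaner's strategy.
def build_strategies
  {
    [:active_record, nil] => :transaction,
    [:active_record, :MigrationRecord] => :deletion
  }
end

# Models a module-level call such as `DatabaseCleaner.strategy = value`:
# the value is applied to every registered cleaner.
def set_strategy_for_all(strategies, value)
  strategies.transform_values! { value }
end

# Models an indexed call such as
# `DatabaseCleaner[:active_record, db: MigrationRecord].strategy = value`:
# only the addressed cleaner changes.
def set_strategy_for(strategies, orm, db, value)
  strategies[[orm, db]] = value
  strategies
end
```

After set_strategy_for_all(build_strategies, :deletion), both entries hold :deletion, so the individual :transaction setting of the primary cleaner is gone. The indexed variant leaves the other entry untouched.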