Rails gives you
migrations
Show archive.org snapshot
to change your database schema with simple commands like add_column
or update
.
Unfortunately these commands are simply not expressive enough to handle complex cases.
This card outlines three different techniques you can use to describe nontrivial migrations in Rails / ActiveRecord.
Note that the techniques below should serve you well for tables with many thousand rows. Once your database tables grows to millions of rows, migration performance becomes an issue. We will talk about this in another card.
1. Use the full power of SQL
Plain SQL can be very powerful. Once you get to know it, it can actually be pleasant to write complex queries or updates with it.
As a first example, we have Post has_many :followers
. For better performance, we want to cache the number of followers in Post#followers_count
.
Now we need to write a migration that not only adds the followers_count
column, but also updates it with the correct count for existing posts.
The migration below uses
subqueries
Show archive.org snapshot
for this:
class AddFollowersCountToPosts < ActiveRecord::Migration
def up
add_column :posts, :followers_count, :integer
update <<~SQL.squish
UPDATE posts
SET followers_count = (
SELECT COUNT(*)
FROM followers
WHERE followers.post_id = posts.id
)
SQL
end
def down
remove_column :posts, :followers_count
end
end
As a second example, we have Project has_many :activities
. Each project has a category. We now want to move this category from Project
to Activity
, so each activity can have a category of its own.
Now we need to write a migration that not only adds the category
column to the activities
table, but also copies the category from the project to each of its activities.
The migration below uses an UPDATE with JOIN for this:
class MoveCategoriesFromProjectToActivity < ActiveRecord::Migration
def up
add_column :activities, :category, :string
update <<-SQL.squish
UPDATE activities
LEFT JOIN projects ON activities.project_id = projects.id
SET activities.category = projects.category
SQL
remove_column :projects, :category
end
def down
# not implemented
end
end
Note that the <<-SQL
bit is a HEREDOC, which is a nice way to write strings that span multiple lines.
Note that this example uses MySQL. Refer to this card if you use PostgreSQL.
If you need more than one JOIN refer to this card
2. Embed models into your migrations
You can never call models in a migration because your models will eventually evolve away from your migrations. This will block future developers (or application instances) from ever migrating a fresh database. Fortunately there is a way to decouple migrations from models by embedding models into the migration Show archive.org snapshot . This way you can freely change your application model without affecting old migrations.
As an example, we have Article has_many :vendors
. We now introduce a boolean flag Vendor#current
, which determines whether the article is currently being procured from that vendor.
We now need to write a migration, that not only adds the current
column, but also flags the first vendor for each article as "current". The migration below uses embedded models to perform that update:
class AddCurrentToVendor < ActiveRecord::Migration
class Vendor < ActiveRecord::Base
end
class Article < ActiveRecord::Base
has_many :vendors, :class_name => 'AddCurrentToVendor::Vendor', :order => 'created_at'
end
def up
add_column :vendors, :current, :boolean
Vendor.reset_column_information # reset database caching
Vendor.update_all(:current => false)
Article.reset_column_information # reset database caching
Article.find_each do |article|
first_vendor = article.vendors.first
if first_vendor
first_vendor.update_attributes!(:current => true)
end
end
end
def down
remove_column :vendors, :current
end
end
This technique has one important caveat: It is very easy to unintentionally use the actual model instead of the embedded model. Note how (in the example above) our two classes were namespaced into the migration class, so Vendor
becomes AddCurrentToVendor::Vendor
. Also note how when defining the association Article has_many :vendors
, we must tell Rails that the class name is actually AddCurrentToVendor::Vendor
. If you forget to do set the :class_name
for a single association, you are using the actual Vendor
class in app/models/vendor.rb
, leading to the exact problems you wanted to avoid with this technique.
If you are unsure if you applied this technique correctly, here is a simple test: Delete all files from app/models/*
and run your migration. If it completes successfully, you are golden.
Also note the use of
reset_column_information
Show archive.org snapshot
. As a rule of thumb, call it once for every migration model, as Rails might have cached an old schema. In the example above, the vendor
table just got a new column which would silently be ignored by the following call to update_all
. But you should also use reset_column_information
if you don't alter the schema in the current migration as it might still cause problems when migrating from scratch due to the caching of latter migrations.
3. Use helpers methods from the database adapter
If your migration needs to analyse existing data, the database adapter gives you methods like
select_rows
Show archive.org snapshot
or
select_all
Show archive.org snapshot
. These methods return query results as simple arrays, hashes, strings, numbers, etc.:
select_all('SELECT * FROM articles')
# => [ { 'name': 'Bioshock Infinite', 'price': 40.0 }, { 'name': 'EVOLAND', 'price': 19.0 } ]
As an example, we have Article has_many :vendors
. We now introduce a boolean flag Vendor#current
, which determines whether the article is currently being procured from that vendor.
We now need to write a migration, that not only adds the current
column, but also flags the first vendor for each article as "current". The migration below uses select_all
to analyse existing records and build UPDATE
statements with the results:
class AddCurrentToVendor < ActiveRecord::Migration
def up
add_column :vendors, :current, :boolean
articles = select_all('SELECT * FROM articles')
articles.each do |article|
first_vendor = select_one("SELECT * FROM vendors WHERE article_id = #{quote article['id']}")
if first_vendor
update("UPDATE vendors SET current=#{quoted_true} WHERE id=#{quote first_vendor['id']}")
end
end
end
def down
remove_column :vendors, :current
end
end
Also see Rails: Talking to the database without instantiating ActiveRecord objects.