TLDR
Using
.includes
or.eager_load
with 1-n associations is dangerous. Always use.preload
instead.
Consider the following ActiveRecord query:
BlogPost.eager_load(
:comments
:attachments,
).to_a
(Let's assume we only have a couple of blog posts; if you use pagination the queries will be more complicated, but the point still stands.
Looks harmless enough? It is not.
The problem
ActiveRecord will rewrite this into a query using LEFT JOIN
s which looks something like this:
SELECT "blog_posts".*, "comments".*, "attachments".*
FROM "blog_posts"
LEFT OUTER JOIN "comments" ON "comments"."blog_post_id" = "blog_posts"."id"
LEFT OUTER JOIN "attachments" ON "attachments"."blog_post_id" = "blog_posts"."id";
Now assume there is a single blog post with 100 comments and 10 attachments. The OUTER JOIN
s will cause the query to return 1000 database rows, the full cross product of the blog post × all its comments × all its attachments, including 1000 redundant copies of all the fields of the blog post. If you add a third association this gets exponentially worse.
Needless to say, this is super inefficient. We've seen cases in production where such queries consume GBs of memory bringing down our servers.
The solution
Just use .preload
instead. Rails will use separate queries to preload the data. This cannot lead to a cross product.
About .includes
If you use .includes
instead of .eager_load
you might or might not be okay. In this case ActiveRecord decides via some heuristic whether to use JOIN
s or separate queries. Do not risk it.
Sometimes you need a JOIN
due to some WHERE
condition on the joined data set. In this case, use .eager_load
on the associations you need, and preload
otherwise.
As long as all your associations are 1-1 you are also okay, in this case the JOIN
might have slightly better performance.