Use find_each instead of all.each in Rails

In Rails, we sometimes need to iterate over every record of a model. A common way to do this is to call all.each on the model. This can consume a lot of memory if the table holds a huge number of records, say millions.

Let’s say we have a model User. We will compare all.each and find_each to see how they differ and when to use each one.

all.each

When we call all.each on a model in Rails, it loads all the records from the table into memory and then iterates over them.

User.all.each do |user|
  puts user.name
end

When we call User.all, it will query the database to fetch all the user records into memory. The query fired when we execute User.all is shown below.

User Load (1.2ms)  SELECT "users".* FROM "users"

This can use a lot of memory, because the query has no LIMIT or OFFSET: every row in the table is loaded at once.

find_each

The find_each approach queries the database in batches, so records are loaded into memory one batch at a time instead of all at once.

Here is the Rails source code for the find_each method.

def find_each(start: nil, finish: nil, batch_size: 1000, error_on_ignore: nil)
  if block_given?
    find_in_batches(start: start, finish: finish, batch_size: batch_size, error_on_ignore: error_on_ignore) do |records|
      records.each { |record| yield record }
    end
  else
    enum_for(:find_each, start: start, finish: finish, batch_size: batch_size, error_on_ignore: error_on_ignore) do
      relation = self
      apply_limits(relation, start, finish).size
    end
  end
end
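The else branch in the source above returns an Enumerator when no block is given, and the block passed to enum_for computes the size lazily. The plain-Ruby sketch below shows the same pattern outside Rails. ItemList and each_item are hypothetical names used only for illustration; enum_for is standard Ruby.

```ruby
# Hypothetical class, not part of Rails; it only demonstrates the enum_for pattern.
class ItemList
  def initialize(items)
    @items = items
  end

  def each_item
    # Without a block, return an Enumerator; the block given to enum_for
    # is only called when the enumerator's size is requested.
    return enum_for(:each_item) { @items.size } unless block_given?

    @items.each { |item| yield item }
  end
end

enum = ItemList.new(%w[a b c]).each_item
size = enum.size                   # computed by the enum_for block
pairs = enum.with_index.to_a       # enumerator methods can be chained
```

In the same way, calling find_each without a block gives back an enumerator that can be chained with methods such as with_index.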

The find_each method accepts the following options, passed as keyword arguments.

  • batch_size : Specifies the size of each batch. Defaults to 1000.
  • start : The primary key value to start from. Inclusive of the value.
  • finish : The primary key value to end at. Inclusive of the value.
  • error_on_ignore : Overrides the application config to specify if an error should be raised when an order is present in the relation.
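To make the options more concrete, here is a minimal plain-Ruby sketch that mimics their meaning on an in-memory array rather than a database table. BatchedTable and its rows are hypothetical stand-ins, not Rails code; only the option names batch_size, start and finish come from find_each itself, and error_on_ignore is left out.

```ruby
# Hypothetical stand-in for a table; NOT ActiveRecord.
# Rows are plain hashes with an :id key acting as the primary key.
class BatchedTable
  def initialize(rows)
    @rows = rows.sort_by { |row| row[:id] }
  end

  # Mimics the options described above: start and finish are inclusive
  # primary key bounds, batch_size is the number of rows per batch.
  def find_each(start: nil, finish: nil, batch_size: 1000)
    unless block_given?
      return enum_for(:find_each, start: start, finish: finish, batch_size: batch_size)
    end

    scoped = @rows.select do |row|
      (start.nil? || row[:id] >= start) && (finish.nil? || row[:id] <= finish)
    end

    # Walk the matching rows one batch at a time, yielding each row.
    scoped.each_slice(batch_size) do |batch|
      batch.each { |row| yield row }
    end
  end
end

table = BatchedTable.new((1..10).map { |id| { id: id, name: "user#{id}" } })

ids = []
table.find_each(start: 3, finish: 8, batch_size: 2) { |row| ids << row[:id] }
```

With a real model, the equivalent call would look like User.find_each(start: 3, finish: 8, batch_size: 2) { |user| ... }.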

We can see that find_each internally uses find_in_batches, which queries the database in batches.

Even if we do not pass any parameters, the query is done in batches of 1000 (the default batch_size).
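As a rough illustration of what querying in batches means, the sketch below simulates the loop in plain Ruby: fetch up to batch_size rows ordered by primary key, process them, then fetch the rows whose id is greater than the last one seen. The fetch_batch and each_in_batches helpers and the rows array are hypothetical stand-ins for the SQL query and the users table. This is a simplified model, not the actual Rails implementation.

```ruby
# Stand-in for a query such as:
#   SELECT * FROM users WHERE id > after_id ORDER BY id LIMIT limit
# Assumes rows are already sorted by :id.
def fetch_batch(rows, after_id, limit)
  rows.select { |row| row[:id] > after_id }.first(limit)
end

# Simplified batching loop; assumes positive integer ids.
def each_in_batches(rows, batch_size: 1000)
  last_id = 0
  loop do
    batch = fetch_batch(rows, last_id, batch_size)
    break if batch.empty?

    yield batch

    # A short batch means the table is exhausted.
    break if batch.size < batch_size

    last_id = batch.last[:id]
  end
end

rows = (1..2500).map { |id| { id: id } }

batch_sizes = []
each_in_batches(rows) { |batch| batch_sizes << batch.size }
# batch_sizes is [1000, 1000, 500]: three small queries instead of one large one
```

At most batch_size rows are fetched per query, which is why memory use stays bounded no matter how large the table is.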

Thus, even when we expect a table to hold only a few records, using find_each is a good practice. It automatically splits the work into batches when the number of records exceeds batch_size, and it keeps the application from hogging memory.

Feel free to comment with your opinions.