Carrierwave: Using a nested directory structure for file system performance


When storing files for many records on the server's file system, CarrierWave's default store_dir can become a problem, because a single directory ends up holding too many entries.

The default storage directory from the Carrierwave templates looks like so:

class ExampleUploader < CarrierWave::Uploader::Base
  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end
end

If you store files for 500k records, the parent directory of that store_dir will contain 500k sub-directories. This causes serious headaches when working with the file system, e.g. via ls or rsync.

Here is a simple solution that scales for a long while.

Solution

A simple, proven solution is to split model.id into chunks of three digits. If you use secret folders in your directory structure, the approach works just as well.

Note that root below is the configured storage root. See our card on a suggested configuration for more information.

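class ExampleUploader < CarrierWave::Uploader::Base
  def store_dir
    File.join(
      root,
      model.class.model_name.collection,
      mounted_as.to_s, # mounted_as is usually a Symbol, which File.join rejects
      split_id_path(model),
      secret_folder(model)
    ).to_s
  end

  # Pads the ID to 6 digits and splits off the last 3, e.g. 1 => "000/001"
  def split_id_path(model)
    padded_id = model.id.to_s.rjust(6, '0')
    padded_id.split(/(\d\d\d)$/).join('/')
  end

  def secret_folder(model)
    # if you use secret folders, do your magic here
    # (must return a String; remove this argument from File.join otherwise)
  end
end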

Example structure

The resulting directory structure will be:

  • /app-root/public/system/users/avatar/000/001/... (1st record)
  • /app-root/public/system/users/avatar/000/002/... (2nd record)
  • ...
  • /app-root/public/system/users/avatar/000/999/... (999th record)
  • /app-root/public/system/users/avatar/001/000/... (1000th record)
  • ...
  • /app-root/public/system/users/avatar/999/999/... (999,999th record)
  • /app-root/public/system/users/avatar/1000/000/... (1 millionth record)

So if you have 500k records, you will still only have about 500 directories inside /app-root/public/system/users/avatar/. Each of them holds at most 1000 sub-directories.
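
To check the mapping above outside of Rails, here is a minimal sketch. The helper name split_id is hypothetical; it takes a plain integer instead of a model, but otherwise uses the same padding and regular expression as split_id_path.

```ruby
# Hypothetical standalone variant of split_id_path that takes an
# integer ID instead of a model, so it runs without Rails or CarrierWave.
def split_id(id)
  padded_id = id.to_s.rjust(6, '0')
  # The capture group makes String#split keep the last three digits
  # as their own element instead of discarding them.
  padded_id.split(/(\d\d\d)$/).join('/')
end

[1, 999, 1000, 999_999, 1_000_000].each do |id|
  puts "#{id} => #{split_id(id)}"
end
```

IDs with more than six digits are not padded, so the extra digits simply end up in the first path segment (1000/000 for the millionth record).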

But I have millions of files

If you expect to store a lot more records, simply introduce a third level (.../123/456/789/...).

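  def split_id_path(model)
    padded_id = model.id.to_s.rjust(9, '0')
    # Two capture groups split off the last two chunks of three digits,
    # e.g. 1 => "000/000/001"
    padded_id.split(/(\d\d\d)(\d\d\d)$/).join('/')
  end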
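
A three-level split can be tried in plain Ruby as well. This is only a sketch: split_id_three_levels is a hypothetical helper that takes an integer ID, and it assumes one capture group per trailing level, since String#split returns every captured group as its own element.

```ruby
# Hypothetical standalone helper: pads the ID to 9 digits and splits
# it into three path segments.
def split_id_three_levels(id)
  padded_id = id.to_s.rjust(9, '0')
  # Each capture group becomes one trailing segment; whatever precedes
  # the match becomes the first segment.
  padded_id.split(/(\d\d\d)(\d\d\d)$/).join('/')
end

[1, 123_456_789, 1_234_567_890].each do |id|
  puts "#{id} => #{split_id_three_levels(id)}"
end
```

With a single capture group, the 9-digit padded ID would only be split once, giving two levels such as 000000/001 instead of three.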

License
Source code in this card is licensed under the MIT License.
Posted by Arne Hartherz to makandra dev (2021-04-26 07:20)