Posted 11 days ago. Visible to the public.

Generating and streaming ZIP archives on the fly

When your Rails application offers a bunch of files for download as a single ZIP archive, you basically have two options:

  1. Write a ZIP file to disk and send it as a download to the user.
  2. Generate a ZIP archive on the fly while streaming it in chunks to the user.

This card is about option 2, and it is actually fairly easy to set up.

We are using this to generate ZIP archives with lots of files (500k+) on the fly, and it works like a charm.

Why stream downloads?

Offering downloads of large archives can be cumbersome:

  • It takes time to build the ZIP file. Users cannot start their download immediately.
  • Or you have to prepare it up front, and can only offer download links once that has happened.
  • For one-off downloads, generating a large file is unnecessary, and you have to take care of removing it once it is no longer required.

When you generate them on the fly, downloads start immediately, and you can add contents while the user is downloading.

How to do it

The excellent zip_tricks gem helps you out, and is not too difficult to integrate.
If you want to send only files from ActiveStorage or Carrierwave attachments, it's even simpler when using zipline.

Instructions for ZipTricks

You want to use ZipTricks if you need some level of control, or if you are building some of the ZIP contents on the fly (like some CSV).

  1. Add zip_tricks to your Gemfile and bundle install.

  2. In your controller, include ZipTricks::RailsStreaming.

  3. You can then use the zip_tricks_stream method in controller actions to generate your contents, as described in the documentation.

  4. Make sure you disable middlewares that generate ETag headers for you (like Rails' Rack::ETag or Rack::SteadyETag).

    • Those middlewares usually want to capture the entire response body in order to generate the ETag based on its contents. If they do that, you won't be streaming to the user.
    • Usually, you can just set your own Last-Modified or ETag header. These middlewares typically skip responses that already carry one of those headers.
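
To illustrate why a content-based ETag middleware gets in the way, here is a minimal plain-Ruby sketch. It does not use Rack or ZipTricks; `SlowBody`, `with_content_etag` and `deliver` are hypothetical names made up for this illustration. The point is only the order of events: once something needs a digest of the whole body, every chunk has to be generated before the first one can be sent.

```ruby
require 'digest'

# A response body that produces its chunks lazily and logs when each one is generated.
class SlowBody
  def initialize(chunks, log)
    @chunks = chunks
    @log = log
  end

  def each
    @chunks.each do |chunk|
      @log << "generated #{chunk}"
      yield chunk
    end
  end
end

# Hypothetical stand-in for a content-based ETag middleware:
# it has to read the whole body before it can compute the digest.
def with_content_etag(body)
  buffered = []
  body.each { |chunk| buffered << chunk }
  etag = Digest::SHA256.hexdigest(buffered.join)
  [{ 'etag' => %("#{etag}") }, buffered]
end

# Sends a body to the "client", logging each chunk as it goes out.
def deliver(body, log)
  body.each { |chunk| log << "sent #{chunk}" }
  log
end

# Without the wrapper, generating and sending are interleaved.
def streamed_log
  log = []
  deliver(SlowBody.new(%w[a b c], log), log)
end

# With the wrapper, all chunks are generated before the first one is sent.
def buffered_log
  log = []
  _headers, body = with_content_etag(SlowBody.new(%w[a b c], log))
  deliver(body, log)
end
```

Comparing `streamed_log` with `buffered_log` shows the difference: in the buffered case nothing is sent until the last chunk exists, which for a large archive means the user waits for the whole ZIP.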

Example controller:

    class DocumentsController < ApplicationController
      include ZipTricks::RailsStreaming

      def download
        documents = Document.all

        # Sets `Last-Modified` header (see above). Or say "fresh_when documents" to use the scope.
        fresh_when Time.current

        # Sets `Content-Disposition` and `Content-Type` headers.
        send_file_headers! filename: 'all_documents.zip'

        zip_tricks_stream do |zip|
          documents.find_each do |document|
            filename = document.filename
            path = document.file.path

            zip.write_stored_file(filename) do |stream|
              File.open(path, 'rb') do |source|
                IO.copy_stream(source, stream)
              end
            end
          end
        end
      end
    end

What happens here is:

  1. zip_tricks_stream generates a buffer object (zip) that can be written to from inside the block.
  2. The controller action completes.
  3. Your code inside the block given to zip_tricks_stream runs and writes to the buffer object.
  4. ZipTricks streams the contents to the user.
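
The order of these steps can be surprising, so here is a simplified model in plain Ruby. It is not ZipTricks' actual implementation; `DeferredBody` and `simulate_request` are hypothetical names. It assumes only that the block is stored as the response body and invoked later, when the web server iterates over that body.

```ruby
# Hypothetical stand-in for the streaming response body. It only remembers the block.
class DeferredBody
  def initialize(&writer)
    @writer = writer
  end

  # The web server calls `each` after the controller action has returned.
  def each(&send_chunk)
    @writer.call(send_chunk)
  end
end

def simulate_request
  events = []

  # "Controller action": registers the block and returns immediately.
  body = DeferredBody.new do |out|
    events << 'block runs'
    out.call('chunk 1')
    out.call('chunk 2')
  end
  events << 'action completes'

  # "Web server": iterates the body and pushes each chunk to the client.
  body.each { |chunk| events << "sent #{chunk}" }
  events
end
```

In this model, `simulate_request` records "action completes" before "block runs". One consequence is that anything the block depends on must still be valid after the action has finished.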

About compression:

  • Use write_stored_file for files that are large or unlikely to compress significantly (like PNG, JPEG, MP4, ...)
  • Use write_deflated_file when adding files that compress well, like CSV, XML, or other text documents.
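
To get a feeling for this rule of thumb, the following sketch compares how well two kinds of data deflate. It uses Ruby's Zlib from the standard library, not ZipTricks, and the sample data is made up: repetitive CSV-like text on the one hand, random bytes as a stand-in for already-compressed media on the other.

```ruby
require 'zlib'
require 'securerandom'

# Returns compressed size divided by original size (lower means better compression).
def deflate_ratio(data)
  Zlib::Deflate.deflate(data).bytesize.to_f / data.bytesize
end

# Repetitive text, similar in spirit to a CSV export.
def sample_csv
  (1..2_000).map { |i| "#{i},Document #{i},application/pdf\n" }.join
end

# Random bytes as a stand-in for media that is already compressed (JPEG, MP4, ...).
def sample_binary
  SecureRandom.random_bytes(50_000)
end
```

Calling `deflate_ratio(sample_csv)` should give a small fraction, while `deflate_ratio(sample_binary)` stays close to 1. Spending CPU time on deflating the second kind of data buys you nothing, which is why storing it uncompressed is the better choice.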

Instructions for Zipline

Zipline uses ZipTricks under the hood and is specifically intended for streaming existing file attachments (from ActiveStorage, Carrierwave, etc.).

  1. Add zipline to your Gemfile and bundle install.
  2. In your controller, include Zipline.
  3. You can then use the zipline method in controller actions as described in the gem's documentation.

Note that Zipline already sets a Last-Modified header, so ETag-generating middlewares leave the response alone and streaming works.

Example controller:

    class DocumentsController < ApplicationController
      include Zipline

      def download
        documents = Document.all

        files_and_filenames = documents.find_each.lazy.map do |document|
          [document.file, document.filename]
        end

        zipline(files_and_filenames, 'all_documents.zip')
      end
    end

Take care to use lazy.map instead of map, or your controller action has to iterate the entire collection before it can start streaming.
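
The difference between map and lazy.map can be seen without Rails. In this sketch, `each_document` is a hypothetical stand-in for `Document.find_each` that logs every record it loads. Taking the first pair stands in for the moment streaming starts to consume the enumerator.

```ruby
# Hypothetical stand-in for `Document.find_each`: logs every record it loads.
def each_document(log)
  return enum_for(:each_document, log) unless block_given?

  (1..5).each do |id|
    log << "loaded #{id}"
    yield id
  end
end

# Plain `map` walks the whole collection right away.
def eager_log
  log = []
  pairs = each_document(log).map { |id| ["file-#{id}", "#{id}.pdf"] }
  log << 'streaming starts'
  pairs.first
  log
end

# `lazy.map` loads records only when the consumer asks for them.
def lazy_log
  log = []
  pairs = each_document(log).lazy.map { |id| ["file-#{id}", "#{id}.pdf"] }
  log << 'streaming starts'
  pairs.first
  log
end
```

With `eager_log`, all five records are loaded before "streaming starts" is logged. With `lazy_log`, "streaming starts" comes first and only the record that is actually needed gets loaded.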


Owner of this card:

Arne Hartherz
Last edit:
10 days ago
by Arne Hartherz
About this deck:
We are makandra and do test-driven, agile Ruby on Rails software development.
License for source code
Posted by Arne Hartherz to makandra dev