What PDFKit is
- PDFKit converts a web page to a PDF document. It uses a WebKit engine under the hood.
- For you as a web developer this means you can keep using the technology you are familiar with and don't need to learn LaTeX. All you need is a pretty print stylesheet.
How to use it from your Rails application
- You can have PDFKit render a website by simply calling PDFKit.new('http://google.com').to_file('google.pdf'). You can then send the PDF using send_file 'google.pdf'.
- You can use your controller to render the body of your PDF to a string and pass it to PDFKit.new. Paths to the stylesheets must be passed separately before calling to_file.
- Alternatively you can use PDFKit::Middleware and all your Rails routes automagically respond to the .pdf format. This is awesome to get started fast, but details like setting the content disposition (download / inline) or the download filename are awkward.
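The second approach above (render to a string, then pass stylesheets separately) can be sketched roughly as follows. This is a minimal sketch under assumptions, not the original code behind this card: html_to_pdf and its parameters are hypothetical names, and it assumes the pdfkit gem is loaded and that its stylesheets and to_pdf methods behave as described in the gem's README.

```ruby
# Hypothetical helper: turns an HTML string into PDF data.
# kit_class is injectable for testing; it defaults to PDFKit (assumes the gem is loaded).
def html_to_pdf(html, stylesheet_paths: [], options: {}, kit_class: nil)
  kit_class ||= Object.const_get(:PDFKit)
  kit = kit_class.new(html, options)
  # Stylesheet paths are passed separately, before the PDF is generated.
  stylesheet_paths.each { |path| kit.stylesheets << path }
  kit.to_pdf
end
```

In a controller action you could then call something like send_data html_to_pdf(render_to_string(layout: 'pdf'), stylesheet_paths: [...]), filename: 'report.pdf', type: 'application/pdf', disposition: 'inline'. The layout name and filename are placeholders. Sending the data yourself also gives you control over the content disposition and the download filename.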
Configure PDFKit in an initializer:
PDFKit.configure do |config|
  config.default_options = {
    # print_media_type: true,
    # page_size: 'A4',
    # margin_top: '2cm',
    # margin_right: '2cm',
    # margin_left: '2cm',
    # margin_bottom: '2cm',
    quiet: true, # No output during PDF generation
    load_error_handling: 'abort', # Crash early
    load_media_error_handling: 'abort', # Crash early
    no_outline: true, # Disable the default outline
    # disable_smart_shrinking: true, # Enable to keep the pixel/dpi ratio linear
  }
  config.wkhtmltopdf = Rails.root.join('vendor/wkhtmltopdf/linux-trusty-amd64/wkhtmltopdf').to_s
end
Most options are forwarded to wkhtmltopdf (see below). You can get a list of supported options by running man wkhtmltopdf. However, you should always have quiet: true to keep your test output and logs clean.
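To make the forwarding more concrete, here is an illustrative approximation of how option keys could map to wkhtmltopdf command line switches. It is a sketch, not PDFKit's actual implementation: to_cli_flags is a hypothetical name, and dropping false and nil values is an assumption of this sketch.

```ruby
# Illustrative approximation of how option keys map to wkhtmltopdf switches.
# This is NOT PDFKit's actual implementation.
def to_cli_flags(options)
  options.flat_map do |key, value|
    flag = "--#{key.to_s.tr('_', '-')}"
    case value
    when true then [flag]           # boolean switch, e.g. --quiet
    when false, nil then []         # left out entirely in this sketch
    else [flag, value.to_s]         # switch with an argument, e.g. --page-size A4
    end
  end
end
```

For example, to_cli_flags(quiet: true, page_size: 'A4') yields the switches --quiet and --page-size A4, which is why the man page of wkhtmltopdf doubles as the option reference.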
How to express page breaks, headers, footers, etc.
Some concepts and formatting only make sense on paper, so the question is how to implement them when all you have is CSS:
- CSS actually has a few print-related directives, e.g. for controlling page breaks:
page-break-before:always; page-break-after:always; page-break-inside
- PDFKit also comes with some custom options for things that are hard to express in CSS (or are not supported by the WebKit engine that PDFKit uses internally), such as:
  - Paper format
  - Print margins
  - Repeating header on every page
  - Repeating footer on every page
- You can actually execute JavaScript before the page is rendered to PDF, and implement things like page numbers in the header or footer.
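As a small illustration of the CSS page break directives, the following sketch joins HTML fragments so that each one starts on a new page. paginate and PAGE_BREAK are hypothetical names, and the sketch assumes the rendering engine honors page-break-after as described above.

```ruby
# Hypothetical helper: joins HTML fragments so that each one starts on a new page.
PAGE_BREAK = '<div style="page-break-after: always;"></div>'

def paginate(sections)
  sections.join(PAGE_BREAK)
end
```

The resulting string can be embedded in the body of the document you pass to PDFKit.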
Headers and Footers
- To add your repeated header and footer files, add or modify these attributes in your PDFKit options hash:
{
  header_html: 'app/views/foo/bar/header.html',
  footer_html: 'app/views/foo/bar/footer.html',
  margin_top: '200px', # Height of the header, can be px or mm
  margin_bottom: '150px', # Height of the footer, can be px or mm
  header_spacing: 52.197, # margin_top converted from px => mm (You can see it as "margin-top: -200px")
  footer_spacing: 0, # The top left edge of the footer is already at the right position
  replace: { # You can pass custom data to your JavaScript this way
    custom: {
      :foo => 'bar',
    }.to_json
  }
}
- Both files are independent DOM trees and share nothing. This means that styles, fonts and scripts must be included in each of these files.
- I ended up inlining style and script tags, since relative paths do not work when parsed by the wkhtmltopdf binary. With something like %style= Rails.application.assets.find_asset('pdf.sass').body.html_safe in your layout you can still keep the CSS in separate files.
- Use the following JavaScript function to access variables such as the page number or the custom JSON encoded hash:
function allQueryInformation() {
  var pdfInfo = {};
  // wkhtmltopdf passes its variables to the header/footer as query parameters
  var queryStrings = document.location.search.substring(1).split('&');
  for (var i = 0; i < queryStrings.length; i++) {
    var keyValuePair = queryStrings[i].split('=', 2);
    var key = keyValuePair[0];
    var value = keyValuePair[1];
    // decodeURIComponent also decodes characters like ":" and ",", which occur in the JSON encoded hash
    pdfInfo[key] = decodeURIComponent(value);
  }
  return pdfInfo;
}
- If you want to place repeated content outside the header or footer area, I found that this is only possible via absolute positioning in the header, but not the footer.
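Since header and footer are independent documents, one way to build them is to generate a standalone HTML string with everything inlined. The following is a minimal sketch under assumptions: standalone_html and its parameters are hypothetical names, and the doctype is included because some versions of wkhtmltopdf reportedly won't render a header without one.

```ruby
# Hypothetical helper: builds a standalone header or footer document.
# CSS and JavaScript are inlined because relative asset paths are not resolved.
def standalone_html(body:, css: '', js: '')
  <<~HTML
    <!DOCTYPE html>
    <html>
      <head>
        <meta charset="utf-8">
        <style>#{css}</style>
        <script>#{js}</script>
      </head>
      <body>
        #{body}
      </body>
    </html>
  HTML
end
```

You would write the result to a file and pass that file's path as header_html or footer_html.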
Fonts and their rendering quality
- The font rendering quality of PDFKit used to be really, really horrible when compared to e.g. saving a page as PDF from a Chrome Browser. Horrible kerning, distorted characters, bad support for web fonts, etc.
- PDFKit has improved a lot here. Their rendering quality is now fine in recent versions of wkhtmltopdf (0.12+).
- You will never beat LaTeX if you need perfect font rendering.
- If you are observing strange behavior when including your fonts, this card might help
Understand the wkhtmltopdf binary
PDFKit is only a thin wrapper around the wkhtmltopdf binary. Unfortunately old versions of wkhtmltopdf have many, many issues and your package sources don't usually come with a recent version. You should have at least 0.12.1, which you can download from the wkhtmltopdf project. Bundle it with your application and tell PDFKit where to find the bundled binary like so:
PDFKit.configure do |config|
  config.wkhtmltopdf = "#{Rails.root}/vendor/wkhtmltopdf/linux-precise-amd64/wkhtmltopdf"
end
When using version 0.12.6 and above, you'll need to add the following command line switch to your PDFKit configuration to avoid crashes with cryptic error messages like PDFKit::ImproperWkhtmltopdfExitStatus:
PDFKit.configure do |config|
  config.default_options = {
    # ... your other options ...
    enable_local_file_access: true,
  }
end
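If your environments run different wkhtmltopdf versions, the version rules above could be encoded in a small helper. This is a sketch under assumptions: options_for_wkhtmltopdf is a hypothetical name, and it assumes that the output of wkhtmltopdf --version contains a three-part version number such as "wkhtmltopdf 0.12.6 (with patched qt)".

```ruby
require 'rubygems' # provides Gem::Version for version comparison

# Hypothetical helper: derives PDFKit options from the output of `wkhtmltopdf --version`.
# Assumes the output contains a version number like "0.12.6".
def options_for_wkhtmltopdf(version_output, base_options = {})
  match = version_output[/\d+\.\d+\.\d+/]
  raise ArgumentError, "no version found in #{version_output.inspect}" unless match

  version = Gem::Version.new(match)
  if version < Gem::Version.new('0.12.1')
    raise ArgumentError, "wkhtmltopdf #{version} is older than 0.12.1"
  end

  if version >= Gem::Version.new('0.12.6')
    base_options.merge(enable_local_file_access: true)
  else
    base_options
  end
end
```

The return value can be assigned to config.default_options in your initializer.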
Deadlock issues on development machine ("PDFKit middleware hangs")
When using the PDFKit middleware on your development machine, you might find that your application "locks up" whenever you request a .pdf route.
This behavior is caused by a deadlock:
- The Rails process is trying to render the page to PDF.
- To render the PDF, additional assets (CSS, images, JavaScript files) are required.
- When using a single-threaded development server like Thin, there is no additional worker process available to deliver those assets.
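The deadlock can be reproduced with a toy model that has nothing to do with PDFKit itself: a "PDF request" occupies one worker and waits for an "asset request" that some worker has to serve. All names in this sketch are hypothetical, and the timeout only exists so the simulation terminates.

```ruby
require 'timeout'

# Toy simulation (not PDFKit code): a "PDF request" occupies one worker and
# waits for an "asset request" that must be served by a free worker.
def pdf_request_completes?(worker_count, wait: 0.5)
  jobs = Queue.new
  done = Queue.new
  workers = Array.new(worker_count) do
    Thread.new { loop { jobs.pop.call } }
  end

  jobs << lambda do
    asset_served = Queue.new
    jobs << -> { asset_served << :ok } # asset request goes back to the same server
    asset_served.pop                   # blocks until some worker serves the asset
    done << :ok
  end

  Timeout.timeout(wait) { done.pop }
  true
rescue Timeout::Error
  false
ensure
  workers&.each(&:kill)
end
```

With a single worker the asset request is never served, so the PDF request never finishes. With two or more workers it completes.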
The easiest fix for this is to use Passenger Standalone for development, which can spawn multiple worker processes. However, Passenger does not allow you to use debugger or byebug.
If you don't want to use Passenger you can also do this:
- Switch from Thin to WEBrick
- In config/environments/development.rb set config.allow_concurrency = true (default in Rails 4)
Note that this allows concurrent requests served from the same process using threads. This might cause unexpected behavior if your application or dependencies are not thread-safe. If you don't know what that means, your application probably isn't thread-safe.
Caveats when implementing pixel-perfect layouts
- PDFKit sets a default vertical margin of 0.75 inches, which disables the automatic header/footer calculation from wkhtmltopdf. This margin was impossible to unset in some versions of PDFKit.
- If the rendered PDF document doesn't have a doctype, some versions of wkhtmltopdf won't render the header.
- A white border is drawn around the header and footer, which you might want to reset.