Posted over 4 years ago. Visible to the public.

Ruby's default encodings can be unexpected

Note: This applies to plain Ruby scripts. Rails does not have this issue.

When you work with Ruby strings, each string gets a default encoding that depends on how it was created. Most strings, string literals in particular, end up as UTF-8 (the default source encoding since Ruby 2.0) unless something explicitly sets a different encoding. That default is just fine.

However, some strings will instead get Encoding.default_external, notably:

  • the string inside a StringIO.new
  • some strings created via CSV
  • files read from disk
  • strings read from IRB
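
To see which encoding a given string ended up with, you can compare a few of the cases above directly. This is a minimal sketch: the variable names and the temp file prefix are made up for illustration, and the printed values depend on your system's locale.

```ruby
require 'stringio'
require 'tempfile'

# A literal takes the source file's encoding (UTF-8 unless a magic comment
# or command-line option says otherwise).
literal = "hello"

# An empty StringIO allocates its own string, tagged with default_external.
buffer_encoding = StringIO.new.string.encoding

# File contents are tagged with default_external as well
# (or transcoded to default_internal, if that is set).
file_encoding = Tempfile.create('encoding-demo') do |file|
  file.write("hello")
  file.flush
  File.read(file.path).encoding
end

puts "default_external: #{Encoding.default_external}"
puts "literal:          #{literal.encoding}"
puts "StringIO.new:     #{buffer_encoding}"
puts "File.read:        #{file_encoding}"
```

If all four lines print UTF-8, your environment is not affected.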

Encoding.default_external defaults to whatever the shell command locale charmap reports on your system. This is usually UTF-8 as well, but it can be something less sane, for example US-ASCII when the locale is set to C.
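
A quick way to check what Ruby picked up from your environment is to print the relevant settings. This is only a diagnostic sketch. The variable names are arbitrary, and the output differs from machine to machine.

```ruby
# What Ruby currently uses for data coming from outside the program.
external = Encoding.default_external

# The charmap name Ruby detected from the locale settings.
charmap = Encoding.locale_charmap

puts "default_external: #{external}"
puts "locale charmap:   #{charmap.inspect}"
puts "default_internal: #{Encoding.default_internal.inspect}"
```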

If you encounter mysterious encoding errors (like Encoding::CompatibilityError: incompatible character encodings: ISO-8859-1 and UTF-8), this might be the cause.
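
The error can be reproduced without touching the system locale. The sketch below simulates a string that was tagged as ISO-8859-1 even though its bytes are UTF-8, which is an assumption about how such a string typically arises. The sample text and variable names are invented for the example.

```ruby
# A regular UTF-8 string with non-ASCII characters.
utf8 = "Gr\u00FC\u00DFe, "

# Simulates input that was tagged with a non-UTF-8 default_external:
# the bytes are UTF-8, but the string claims to be ISO-8859-1.
mislabeled = "M\u00FCller".dup.force_encoding("ISO-8859-1")

error = begin
  utf8 + mislabeled
  nil
rescue Encoding::CompatibilityError => e
  e
end

puts error.message if error

# Relabeling is only correct if the bytes really are UTF-8.
relabeled = mislabeled.dup.force_encoding("UTF-8")
greeting = utf8 + relabeled
puts greeting
```

Note that force_encoding only changes the label and leaves the bytes alone. If the data really is ISO-8859-1, encode("UTF-8") is the appropriate conversion instead.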

You can override this behaviour by manually setting Encoding.default_external = 'UTF-8'. You should do this at the very beginning of your code, because strings that already exist keep the encoding they were created with.
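
A minimal sketch of what that looks like at the top of a script, with a StringIO used to confirm the effect. The buffer contents are just sample data.

```ruby
require 'stringio'

# Set this before any other code creates strings or opens files.
Encoding.default_external = 'UTF-8'

# A fresh StringIO now allocates a UTF-8 string.
buffer = StringIO.new
buffer << "caf\u00E9"
buffer_encoding = buffer.string.encoding

puts buffer_encoding
```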

Owner of this card:

Tobias Kraze
Last edit:
over 4 years ago
by Tobias Kraze
Posted by Tobias Kraze to makandra dev