Ruby's default encodings can be unexpected
Note: This applies to plain Ruby scripts. Rails does not have this issue.
When you work with Ruby strings, each string gets a default encoding that depends on how it was created. Most strings get the encoding
Encoding.default_internal, or UTF-8 if no internal encoding is set. This is the default and works just fine.
However, some strings will instead get Encoding.default_external, notably:
- the string inside a
- some strings created via
- files read from disk
- strings read in an IRB session
Encoding.default_external defaults to whatever
locale charmap says on your system. This is usually UTF-8 as well, but it can be something less sane.
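To see which defaults are in effect on a given machine, a quick check like the following can help. It is only a sketch; the variable names are illustrative, and the printed values depend on your system's locale.

```ruby
# Inspect the encodings Ruby picked up for this process.
external = Encoding.default_external   # used for data read from outside, e.g. files
internal = Encoding.default_internal   # nil unless it was set explicitly
locale   = Encoding.locale_charmap     # what the system locale reports

puts "default_external: #{external}"
puts "default_internal: #{internal.inspect}"
puts "locale charmap:   #{locale.inspect}"
```

If the first line prints anything other than UTF-8, strings read from disk will be tagged with that encoding.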
If you encounter mysterious encoding errors (like
Encoding::CompatibilityError: incompatible character encodings: ISO-8859-1 and UTF-8), this might be what happened.
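The following sketch reproduces such an error on purpose. It assumes a system whose external default is ISO-8859-1 and simulates that by setting Encoding.default_external by hand; the temporary file and variable names are made up for the demonstration.

```ruby
require 'tempfile'

original_external = Encoding.default_external
begin
  # Assumption: pretend the system locale charmap is ISO-8859-1.
  Encoding.default_external = 'ISO-8859-1'

  # Hypothetical sample file containing the UTF-8 bytes of "café".
  sample = Tempfile.new('encoding-demo')
  sample.close
  File.binwrite(sample.path, "caf\u00E9")

  from_disk = File.read(sample.path)   # tagged with the external default
  literal   = " na\u00EFve"            # string literal, UTF-8

  begin
    from_disk + literal                # both contain non-ASCII characters
  rescue Encoding::CompatibilityError => error
    error_message = error.message
  end

  puts from_disk.encoding
  puts error_message
ensure
  sample.unlink if sample
  Encoding.default_external = original_external   # undo the simulation
end
```

The file's bytes are valid UTF-8, but Ruby labels them with the external default, so combining them with a UTF-8 literal fails as soon as both sides contain non-ASCII characters.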
You can override this behaviour by manually setting
Encoding.default_external = 'UTF-8'. You should do this at the very beginning of your code.
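A minimal sketch of the override in a standalone script might look like this. The temporary file and variable names are hypothetical and only serve to show that file content is now tagged as UTF-8.

```ruby
# Set the external default first, before any file or input is read.
Encoding.default_external = 'UTF-8'

require 'tempfile'

# Hypothetical sample file containing the UTF-8 bytes of "café".
sample = Tempfile.new('encoding-demo')
sample.close
File.binwrite(sample.path, "caf\u00E9")

from_disk = File.read(sample.path)     # now tagged UTF-8
combined  = "na\u00EFve " + from_disk  # no CompatibilityError

puts from_disk.encoding
puts combined.valid_encoding?

sample.unlink
```

If changing the global default is not an option, passing the encoding per call (for example File.read(path, encoding: 'UTF-8')) is another way to get the same result for a single read.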