Read more

Ruby < 2.4: Downcasing or upcasing umlauts

Thomas Eisenbarth
January 09, 2012Software engineer at makandra GmbH

Using .downcase or .upcase on strings containing umlauts does not work as expected in Ruby versions before 2.4. It leaves the umlauts unchanged:

"Über".downcase
=> "Über"

"Ärger".downcase
=> "Ärger"
Illustration money motivation

Opscomplete powered by makandra brand

Save money by migrating from AWS to our fully managed hosting in Germany.

  • Trusted by over 100 customers
  • Ready to use with Ruby, Node.js, PHP
  • Proactive management by operations experts
Read more Show archive.org snapshot

The very same applies for french accents (Thanks Guillaume!):

"Être ou ne pas être, telle est la question".downcase
=> "Être ou ne pas être, telle est la question"

Obviously, this leads to problems when comparing strings:

"Über".downcase == "über"
=> false

In Rails you can use ActiveSupports' multibyte chars Show archive.org snapshot to avoid that problem. It gives you a wrapped, correctly encoded version of your string:

"Ärger".mb_chars.downcase
#<ActiveSupport::Multibyte::Chars:0x7f60b1e58dc8 @wrapped_string="ärger">

Comparison works like expected now:

"Über".mb_chars.downcase == "über"
=> true
Posted by Thomas Eisenbarth to makandra dev (2012-01-09 18:33)