Ruby < 2.4: Downcasing or upcasing umlauts

Thomas Eisenbarth
January 09, 2012
Software engineer at makandra GmbH

In Ruby versions before 2.4, .downcase and .upcase only convert ASCII letters. Called on strings containing umlauts, they do not work as expected and leave the umlauts unchanged:

"Über".downcase
=> "Über"

"Ärger".downcase
=> "Ärger"

The same applies to French accents (thanks, Guillaume!):

"Être ou ne pas être, telle est la question".downcase
=> "Être ou ne pas être, telle est la question"

Obviously, this leads to problems when comparing strings:

"Über".downcase == "über"
=> false
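
To find out whether a given Ruby is affected, one option is a small feature test. The following is only a minimal sketch; the method name unicode_case_mapping? is made up for illustration and is not part of Ruby:

```ruby
# Hypothetical helper (not part of Ruby): returns true when String#downcase
# converts non-ASCII letters, which is the case from Ruby 2.4 onwards.
def unicode_case_mapping?
  "Ä".downcase == "ä"
end

if unicode_case_mapping?
  puts "Ruby #{RUBY_VERSION}: downcase handles umlauts"
else
  puts "Ruby #{RUBY_VERSION}: downcase is ASCII-only, a workaround is needed"
end
```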

In Rails you can use ActiveSupport's multibyte chars to avoid that problem. Calling mb_chars wraps your string in a proxy object whose case methods handle multibyte characters correctly:

"Ärger".mb_chars.downcase
=> #<ActiveSupport::Multibyte::Chars:0x7f60b1e58dc8 @wrapped_string="ärger">

Comparison works as expected now:

"Über".mb_chars.downcase == "über"
=> true
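
If the same code has to run with and without ActiveSupport, the comparison could be wrapped in a small helper. This is a sketch under assumptions, not a definitive implementation: the names unicode_downcase and same_ignoring_case? are hypothetical, and the fallback branch only handles umlauts on Ruby 2.4 or later.

```ruby
# Hypothetical helpers; the names are illustrative and not part of Ruby or Rails.

# Downcases via ActiveSupport's mb_chars when it is loaded,
# otherwise falls back to the built-in String#downcase.
def unicode_downcase(string)
  if string.respond_to?(:mb_chars)
    # to_s unwraps the ActiveSupport::Multibyte::Chars proxy into a plain String
    string.mb_chars.downcase.to_s
  else
    # Unicode-aware only on Ruby 2.4 and later
    string.downcase
  end
end

# Compares two strings after downcasing both with the helper above.
def same_ignoring_case?(a, b)
  unicode_downcase(a) == unicode_downcase(b)
end

puts same_ignoring_case?("Über", "über")
```

Calling to_s on the result returns a plain String again, which is useful when the value is passed on to code that does not expect the Chars wrapper.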
Posted by Thomas Eisenbarth to makandra dev (2012-01-09 18:33)