Ruby < 2.4: Downcasing or upcasing umlauts


Using .downcase or .upcase on strings containing umlauts does not work as expected in Ruby versions before 2.4. It leaves the umlauts unchanged:

"Über".downcase
=> "Über"

"Ärger".downcase
=> "Ärger"

The same applies to French accents (thanks, Guillaume!):

"Être ou ne pas être, telle est la question".downcase
=> "Être ou ne pas être, telle est la question"

Obviously, this leads to problems when comparing strings:

"Über".downcase == "über"
=> false
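
Whether a given interpreter is affected can be checked directly instead of relying on the version number alone. The following is a minimal sketch; the helper name unicode_downcase? is hypothetical and not part of Ruby or Rails:

```ruby
# encoding: utf-8
# (The magic comment above only matters on Ruby 1.9, where source files
# are not read as UTF-8 by default.)

# Hypothetical helper: returns true if String#downcase on the running
# interpreter lowercases non-ASCII letters such as umlauts.
def unicode_downcase?
  "Ü".downcase == "ü"
end

puts "Ruby #{RUBY_VERSION}: Unicode-aware downcase? #{unicode_downcase?}"
```

Based on the behavior described above, this should report false before Ruby 2.4 and true from 2.4 on.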

In Rails you can use ActiveSupport's multibyte chars to avoid that problem. It gives you a wrapped version of your string that handles multibyte characters correctly:

"Ärger".mb_chars.downcase
=> #<ActiveSupport::Multibyte::Chars:0x7f60b1e58dc8 @wrapped_string="ärger">

Comparison now works as expected:

"Über".mb_chars.downcase == "über"
=> true
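
Note that mb_chars returns a wrapper object, so call to_s on the result when you need a plain String again. Outside of Rails, one possible workaround is to translate the affected letters by hand. The sketch below assumes UTF-8 strings and only covers the letters listed in its two constants. The names UPPERCASE_ACCENTS, LOWERCASE_ACCENTS and downcase_with_accents are hypothetical and exist only for this illustration:

```ruby
# encoding: utf-8

# Hypothetical lookup tables: each uppercase letter must sit at the same
# position as its lowercase counterpart. Extend both as needed.
UPPERCASE_ACCENTS = "ÄÖÜÀÂÇÉÈÊËÎÏÔÙÛ"
LOWERCASE_ACCENTS = "äöüàâçéèêëîïôùû"

# Hypothetical helper: returns a lowercased plain String.
def downcase_with_accents(string)
  if string.respond_to?(:mb_chars)
    # ActiveSupport is loaded: let it do the work and unwrap the result.
    string.mb_chars.downcase.to_s
  else
    # Plain Ruby: String#downcase handles ASCII letters (and, from 2.4 on,
    # everything else); String#tr maps the listed accented letters.
    string.downcase.tr(UPPERCASE_ACCENTS, LOWERCASE_ACCENTS)
  end
end

puts downcase_with_accents("Über")
puts downcase_with_accents("Être ou ne pas être, telle est la question")
```

This is not a general Unicode solution: any uppercase letter missing from the tables is left unchanged on Ruby versions before 2.4.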

Last edit: almost 6 years ago by Henning Koch
Source code in this card is licensed under the MIT License.
Posted by Thomas Eisenbarth to makandra dev (2012-01-09 17:33)