Sanitize filename with user input


If you have to build a filename (e.g. for use in downloads) that contains user input, keep in mind that the input may be malicious. Characters such as path separators can lead to security problems like path traversal.

Instead of blacklisting characters such as /, ., \ or ´, take a stricter approach and whitelist the characters you allow:

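def sanitize_filename(filename)
  # Keep ASCII letters, digits, dots and dashes; replace everything else.
  # A-Z and a-z are spelled out: the single range A-z also covers [ \ ] ^ _ `
  filename.gsub(/[^0-9A-Za-z.\-]/, '_')
end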
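
When writing such a whitelist, it is worth spelling the letter ranges out as A-Z and a-z. A single A-z range spans ASCII 65 to 122 and therefore also admits six non-letter characters, including the backslash. The following sketch lists them and compares two variants; the method names `sanitize_loose` and `sanitize_strict` are made up for this comparison only.

```ruby
# Characters covered by the single range A-z (ASCII 65..122) that are not letters.
extra = ('A'.ord..'z'.ord).map(&:chr).reject { |c| c.match?(/[A-Za-z]/) }
# extra == ["[", "\\", "]", "^", "_", "`"]

# Hypothetical variant using the single range A-z.
def sanitize_loose(filename)
  filename.gsub(/[^0-9A-z.\-]/, '_') # the backslash slips through
end

# Hypothetical variant with both letter ranges spelled out.
def sanitize_strict(filename)
  filename.gsub(/[^0-9A-Za-z.\-]/, '_')
end

sanitize_loose('..\\secret.txt')  # => "..\\secret.txt"
sanitize_strict('..\\secret.txt') # => ".._secret.txt"
```

On systems that treat the backslash as a path separator, the loose variant would leave a traversal sequence intact.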

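If you need to keep German umlauts, /[^\w\-]/ is not enough on current Ruby: since Ruby 1.9, \w matches only ASCII word characters regardless of the locale (and this pattern would also replace dots). Use a Unicode-aware class instead, e.g. /[^\p{L}\p{N}.\-]/.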
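
As a minimal sketch of a Unicode-aware whitelist: the helper name `sanitize_filename_unicode` is hypothetical, and the pattern assumes you want to keep letters and digits of any script, not only German ones.

```ruby
# Hypothetical helper: keeps letters (\p{L}) and digits (\p{N}) of any script,
# plus dots and dashes; everything else becomes an underscore.
def sanitize_filename_unicode(filename)
  filename.gsub(/[^\p{L}\p{N}.\-]/, '_')
end

'Ü'.match?(/\w/)     # => false, \w is ASCII-only
'Ü'.match?(/\p{L}/)  # => true

sanitize_filename_unicode('Übersetzung für Jörg.pdf')
# => "Übersetzung_für_Jörg.pdf"
```

Note that this only covers precomposed characters. If the input arrives in decomposed form (a base letter followed by a combining mark), the combining mark is not in \p{L} and would be replaced.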
You might also want to transliterate Umlauts before using gsub, e.g.

I18n.transliterate('Übersetzung')
=> 'Uebersetzung'

with the following rules in <locale>.yml, nested under the locale key (e.g. de):

i18n:
  transliterate:
    rule:
      Ä: Ae
      Ö: Oe
      Ü: Ue
      ä: ae
      ö: oe
      ü: ue
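
Transliteration and the whitelist can be combined into one step. The sketch below is only an illustration: to stay runnable without Rails it uses a plain Hash (`UMLAUTS`, a made-up constant mirroring the rules above) instead of I18n.transliterate, and the helper name `sanitize_german_filename` is hypothetical.

```ruby
# Stand-in for the I18n rules above, so the sketch runs without Rails.
UMLAUTS = {
  'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue',
  'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue'
}.freeze

# Hypothetical helper: transliterate umlauts first, then apply the ASCII whitelist.
def sanitize_german_filename(filename)
  transliterated = filename.gsub(/[ÄÖÜäöü]/, UMLAUTS)
  transliterated.gsub(/[^0-9A-Za-z.\-]/, '_')
end

sanitize_german_filename('Übersetzung für Jörg.pdf')
# => "Uebersetzung_fuer_Joerg.pdf"
```

In a Rails application, I18n.transliterate(filename) would presumably take the place of the first gsub.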

sanitize_filename replaces every character except ASCII letters, digits, dots and dashes with an underscore:

sanitize_filename "this -> !#+ <- gets sanitized, honey!"
=> "this_-_______-_gets_sanitized__honey_"

Author: Thomas Eisenbarth
Last edit: Daniel Straßner
Keywords: safe, secure, string
License: Source code in this card is licensed under the MIT License.
Posted by Thomas Eisenbarth to makandra dev (2011-06-28 13:21)