Sanitize filename with user input

If you have to build a filename (e.g. for use in downloads) that contains user input, keep in mind that the input may be malicious. Characters such as path separators can make the file end up somewhere you did not intend.

Instead of blacklisting characters such as / , . \ ´ etc., take the stricter approach and whitelist the characters you allow:

def sanitize_filename(filename)
  # A-Za-z rather than A-z: the latter also matches [ \ ] ^ _ and `
  filename.gsub(/[^0-9A-Za-z.\-]/, '_')
end
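
Because dots are allowed, a whitelist like this still lets a bare ".." or a leading dot through. The following is a minimal sketch of a stricter variant, assuming you also want to drop directory components and never return an empty name. The method name sanitize_filename_strict and the fallback parameter are hypothetical and not part of the original method.

```ruby
# Hypothetical helper: a stricter variant of the whitelist approach.
def sanitize_filename_strict(filename, fallback: 'file')
  # Turn backslashes into slashes so File.basename also drops
  # Windows-style directory components.
  name = File.basename(filename.to_s.gsub('\\', '/'))
  # Whitelist: ASCII letters, digits, dot and dash; the rest becomes "_".
  name = name.gsub(/[^0-9A-Za-z.\-]/, '_')
  # Strip leading dots so the result is never ".", ".." or a hidden file.
  name = name.sub(/\A\.+/, '')
  name.empty? ? fallback : name
end

sanitize_filename_strict('../../etc/passwd')
# => "passwd"
```

Whether stripping leading dots or falling back to a fixed name is appropriate depends on your application; treat both as assumptions of this sketch.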

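If you need to keep German umlauts, note that on Ruby 1.9 and later \w matches only ASCII word characters ([a-zA-Z0-9_]) regardless of locale, so /[^\w\-]/ will not preserve them (it also replaces dots). Use a Unicode-aware class such as /[^\p{Alnum}.\-]/ instead.
Alternatively, transliterate umlauts before calling gsub, e.g.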

I18n.transliterate('Übersetzung')
=> 'Uebersetzung'

with the following rules in <locale>.yml (nested under the locale key, e.g. de:):

i18n:
  transliterate:
    rule:
      Ä: Ae
      Ö: Oe
      Ü: Ue
      ä: ae
      ö: oe
      ü: ue
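
Combining transliteration with the whitelist might look like the sketch below. To stay runnable without the I18n gem, it replaces I18n.transliterate with a hypothetical UMLAUTS hash that mirrors the rules above; in a Rails app you would call I18n.transliterate instead. The method name sanitize_german_filename is likewise made up for illustration.

```ruby
# Stand-in for the I18n rules above (assumption: I18n is not loaded here).
UMLAUTS = {
  'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue',
  'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue'
}.freeze

# Hypothetical helper: transliterate umlauts first, then apply the whitelist.
def sanitize_german_filename(filename)
  # String#gsub with a Hash replaces each match with the value for that key.
  transliterated = filename.gsub(/[ÄÖÜäöü]/, UMLAUTS)
  transliterated.gsub(/[^0-9A-Za-z.\-]/, '_')
end

sanitize_german_filename('Übersetzung für Köln.pdf')
# => "Uebersetzung_fuer_Koeln.pdf"
```

Transliterating first keeps the name readable; applying the whitelist alone would turn each umlaut into an underscore.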

The sanitize_filename method above replaces every character except letters, digits, dots and dashes with an underscore:

sanitize_filename "this -> !#+ <- gets sanitized, honey!"
=> "this_-_______-_gets_sanitized__honey_"
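
If umlauts and other non-ASCII letters should survive rather than be transliterated, a Unicode-aware whitelist is one option. This is a sketch only: the method name sanitize_filename_unicode is hypothetical, and it assumes the input string is UTF-8 encoded and that your file system and clients handle non-ASCII names.

```ruby
# Hypothetical helper: whitelist Unicode letters and digits, dots and dashes.
def sanitize_filename_unicode(filename)
  # \p{Alnum} matches alphabetic and numeric characters in any script.
  filename.gsub(/[^\p{Alnum}.\-]/, '_')
end

sanitize_filename_unicode('Übersetzung für Köln.pdf')
# => "Übersetzung_für_Köln.pdf"
```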
Thomas Eisenbarth Almost 13 years ago