How to use html_safe correctly


Back in the war, Rails developers had to manually HTML-escape user-supplied text before it was rendered in a view. If even a single piece of user-supplied text was rendered without prior escaping, it enabled XSS attacks such as injecting a <script> tag into the view of another user.

Because this practice was so error-prone, the rails_xss plugin was developed and later integrated into Rails 3. rails_xss follows a different approach: Instead of relying on the developer taking great care to escape every single piece of user-supplied text, rails_xss escapes everything by default. If a piece of text is indeed allowed to be rendered without escaping (e.g. the result of a helper method that builds some HTML for us), the developer needs to explicitly mark this string as "safe".

Unfortunately, while rails_xss is a good solution for preventing XSS, it is easily misused, leaving Rails applications open to the exact same XSS attacks rails_xss was built to defend against. A plethora of bad examples on the internet have reinforced common misconceptions about how this mechanism works. In particular, a popular anti-pattern is to randomly pepper helper methods with calls to html_safe whenever you see escaped HTML tags. This is wrong and enables XSS in your application.

If you have ever done this, or if you somehow think that html_safe is the "reverse" of the old h(...) helper we used to escape input with, you need to read this card and learn more about how html_safe actually works.

How html_safe works

Calling html_safe on a String returns a new object that looks and acts like a String, but actually is an ActiveSupport::SafeBuffer:

"foo".length
# => 3
"foo".class
# => String

"foo".html_safe.length
# => 3
"foo".html_safe.class
# => ActiveSupport::SafeBuffer

The behavior of SafeBuffer differs from a String in one way only: When you append a String to a SafeBuffer (by calling + or <<), that other String is HTML-escaped before it is appended to the SafeBuffer:

"<foo>".html_safe + "<bar>"
# => "<foo>&lt;bar&gt;"

When you append another SafeBuffer to a SafeBuffer, no escaping will occur:

"<foo>".html_safe + "<bar>".html_safe
# => "<foo><bar>"

Note how calling html_safe on a String doesn't escape or unescape the String itself. It doesn't change the string at all. All it does is return a SafeBuffer which will handle future concatenations differently than a String.
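
To make the concatenation rule concrete, here is a toy model in plain Ruby. TinySafeBuffer and mark_safe are hypothetical names invented for this sketch; it is not how ActiveSupport implements SafeBuffer, only an illustration of the behavior described above.

```ruby
require "erb"

# Toy stand-in for ActiveSupport::SafeBuffer (hypothetical, for illustration only).
class TinySafeBuffer < String
  # Appending escapes the argument unless it is itself a TinySafeBuffer.
  def <<(other)
    super(other.is_a?(TinySafeBuffer) ? other : ERB::Util.html_escape(other.to_s))
  end

  # + builds a new buffer and leaves the receiver untouched.
  def +(other)
    dup << other
  end
end

# Toy counterpart of String#html_safe: wraps the text without changing it.
def mark_safe(string)
  TinySafeBuffer.new(string)
end

mark_safe("<foo>") + "<bar>"             # => "<foo>&lt;bar&gt;"
mark_safe("<foo>") + mark_safe("<bar>")  # => "<foo><bar>"
mark_safe("<foo>") == "<foo>"            # => true, the text itself is unchanged
```

The only thing "marking as safe" does in this model is change the class of the object, which in turn changes how later appends treat their argument.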

How Rails auto-escapes in views

Rails renders your views into a SafeBuffer. It starts with an empty SafeBuffer and one by one appends the components of your views to it. This means that any <%= expression %> in your view template will be HTML-escaped, unless the expression returns a SafeBuffer, which does not need to be escaped.

Take the following ERB template:

<p>
  <%= '<br />' %>
  <%= '<br />'.html_safe %>
</p>

Somewhere inside of Rails, this ERB template will be converted into a Ruby expression like this:

html = ''.html_safe
html << '<p>'.html_safe
html << '<br />'
html << '<br />'.html_safe
html << '</p>'.html_safe
html

If we eval the expression above, we will get this result:

<p>
  &lt;br /&gt;
  <br />
</p>

Common misuse example and how to fix it

This section describes a common example of how html_safe can be misused and leave your site vulnerable to XSS attacks.

Let's say we want to write a helper method which takes some content and wraps it into a <div> tag with the class "group". Our first implementation might look like this:

def group(content)
  "<div class='group'>#{content}</div>"
end

We run tests and realize that our helper escapes too much. Content appears like this:

&lt;div class='group'&gt;content&lt;/div&gt;

A common mistake is to see those escaped angle brackets, and "improve" the helper by making everything html_safe:

def group(content)
  "<div class='group'>#{content}</div>".html_safe
end

We have just created a helper that vouches for its return value to be html_safe. By extension, it vouches for content to be safe, when it actually does not know anything about content. If content is unsafe user input, it will be rendered unescaped:

<div class='group'><script>alert('pwned!')</script></div>

What we actually want to do is to escape content if it is unsafe, but leave it unescaped if it is safe. To achieve this we can simply use SafeBuffer's concatenation behavior:

def group(content)
  html = "".html_safe
  html << "<div class='group'>".html_safe
  html << content
  html << "</div>".html_safe
  html
end

Our helper still returns a safe string, but correctly escapes content if it is unsafe. Note how much more flexible our group helper has become because it now works as expected with both safe and unsafe arguments. We can now leave it up to the caller whether to mark input as safe or not, and we no longer need to make any assumptions about the safeness of content.

Note how built-in Rails helpers like link_to or content_tag also generate HTML tags around content that might or might not be user input. These helpers also work like the last implementation of group above, so we could refactor our helper to this:

def group(content)
  content_tag(:div, content, class: 'group')
end

Posted by Henning Koch to makandra dev (2011-10-25 13:21). Last edit by Michael Leimstädtner.
License: Source code in this card is licensed under the MIT License.