Before Rails 3, developers had to manually HTML-escape user-supplied text before it was rendered in a view. A single piece of user-supplied text rendered without prior escaping was enough to enable XSS attacks, such as injecting a <script> tag into the view of another user.
Because this practice was so error-prone, the rails_xss plugin was developed and later integrated into Rails 3. rails_xss follows a different approach: instead of relying on the developer to carefully escape every single piece of user-supplied text, rails_xss escapes everything by default. If a piece of text is indeed allowed to be rendered without escaping (e.g. the result of a helper method that builds some HTML for us), the developer needs to explicitly mark this string as "safe".
Unfortunately, while rails_xss is a good solution for preventing XSS, it is easily misused, leaving Rails applications open to the exact same XSS attacks rails_xss was built to defend against. A plethora of bad examples on the internet have reinforced common misconceptions about how this mechanism works. In particular, a popular anti-pattern is to pepper helper methods with calls to html_safe whenever you see escaped HTML tags. This is wrong and enables XSS in your application.
If you have ever done this, or if you somehow think that html_safe is the "reverse" of the old h(...) helper we used to escape input with, you need to read this card and learn more about how html_safe actually works.
How html_safe works
Calling html_safe on a String returns a new object that looks and acts like a String, but actually is an ActiveSupport::SafeBuffer:
"foo".length
# => 3
"foo".class
# => String
"foo".html_safe.length
# => 3
"foo".html_safe.class
# => ActiveSupport::SafeBuffer
The behavior of SafeBuffer differs from a String in one important way: when you append a String to a SafeBuffer (by calling + or <<), that other String is HTML-escaped before it is appended to the SafeBuffer:
"<foo>".html_safe + "<bar>"
# => "<foo>&lt;bar&gt;"
When you append another SafeBuffer to a SafeBuffer, no escaping will occur:
"<foo>".html_safe + "<bar>".html_safe
# => "<foo><bar>"
Note how calling html_safe on a String doesn't escape or unescape the String itself. It doesn't change the string at all. All it does is return a SafeBuffer which will handle future concatenations differently than a String.
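To make the escape-on-append rule concrete, here is a minimal pure-Ruby sketch. TinySafeBuffer is a hypothetical stand-in written for this illustration, not the real ActiveSupport::SafeBuffer, and it assumes ERB::Util.html_escape from Ruby's standard library as the escaping function:

```ruby
require 'erb'

# TinySafeBuffer is a hypothetical, simplified stand-in for
# ActiveSupport::SafeBuffer. It only models escape-on-append.
class TinySafeBuffer < String
  # Non-destructive append: returns a new buffer.
  def +(other)
    TinySafeBuffer.new(to_s + escape_unless_safe(other))
  end

  # Destructive append: modifies this buffer in place.
  def <<(other)
    super(escape_unless_safe(other))
  end

  private

  # Safe buffers pass through untouched, everything else is HTML-escaped.
  def escape_unless_safe(other)
    other.is_a?(TinySafeBuffer) ? other.to_s : ERB::Util.html_escape(other.to_s)
  end
end

safe = TinySafeBuffer.new("<foo>")

puts safe + "<bar>"                      # prints <foo>&lt;bar&gt;
puts safe + TinySafeBuffer.new("<bar>")  # prints <foo><bar>
puts safe                                # prints <foo> (creating the buffer changed nothing)
```

The real SafeBuffer covers many more String methods, but the core idea is the same: the buffer itself is trusted, and anything appended to it has to prove that it is trusted too.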
How Rails auto-escapes in views
Rails renders your views into a SafeBuffer. It starts with an empty SafeBuffer and one by one appends the components of your views to it. This means that any <%= expression %> in your view template will be HTML-escaped, unless the expression returns a SafeBuffer, which does not need to be escaped.
Take the following ERB template:
<p>
<%= '<br />' %>
<%= '<br />'.html_safe %>
</p>
Somewhere inside of Rails, this ERB template will be converted into a Ruby expression like this:
html = ''.html_safe
html << '<p>'.html_safe
html << '<br />'
html << '<br />'.html_safe
html << '</p>'.html_safe
html
If we eval the expression above, we will get this result:
<p>
&lt;br /&gt;
<br />
</p>
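The append loop described in this section can be imitated without Rails. The following is only a sketch under stated assumptions: SafeString and render are hypothetical names invented for this example, and ERB::Util.html_escape from Ruby's standard library stands in for the escaping Rails performs:

```ruby
require 'erb'

# SafeString is a hypothetical marker class standing in for
# ActiveSupport::SafeBuffer. It is not part of Rails.
class SafeString < String; end

# Appends each value to the output, escaping everything that is
# not explicitly marked as safe.
def render(values)
  values.each_with_object(String.new) do |value, output|
    output << (value.is_a?(SafeString) ? value : ERB::Util.html_escape(value.to_s))
  end
end

parts = [
  SafeString.new('<p>'),     # static template text is trusted
  '<br />',                  # like <%= '<br />' %>
  SafeString.new('<br />'),  # like <%= '<br />'.html_safe %>
  SafeString.new('</p>')     # static template text is trusted
]

puts render(parts)  # prints <p>&lt;br /&gt;<br /></p>
```

Only the second part is escaped, because it is the only value that nobody vouched for.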
Common misuse example and how to fix it
This section describes a common example of how html_safe can be misused, leaving your site vulnerable to XSS attacks.
Let's say we want to write a helper method which takes some content and wraps it in a <div> tag with the class "group". Our first implementation might look like this:
def group(content)
  "<div class='group'>#{content}</div>"
end
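We run tests and realize that too much gets escaped. The helper returns a plain String, so the view escapes the <div> tags along with the content. The rendered HTML looks like this:
&lt;div class=&#39;group&#39;&gt;content&lt;/div&gt;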
A common mistake is to see those escaped angle brackets, and "improve" the helper by making everything html_safe:
def group(content)
  "<div class='group'>#{content}</div>".html_safe
end
We have just created a helper that vouches for its return value to be html_safe. By extension, it vouches for content to be safe, when it actually does not know anything about content. If content is unsafe user input, it will be rendered unescaped:
<div class='group'><script>alert('pwned!')</script></div>
What we actually want to do is to escape content if it is unsafe, but leave it unescaped if it is safe. To achieve this we can simply use SafeBuffer's concatenation behavior:
def group(content)
  html = "".html_safe
  html << "<div class='group'>".html_safe
  html << content
  html << "</div>".html_safe
  html
end
Our helper still returns a safe string, but correctly escapes content if it is unsafe. Note how much more flexible our group helper has become because it now works as expected with both safe and unsafe arguments. We can now leave it up to the caller whether to mark input as safe or not, and we no longer need to make any assumptions about the safeness of content.
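The difference between the two helpers is easiest to see with an attack payload. This is a sketch that runs without Rails: TinySafeBuffer is a hypothetical stand-in for ActiveSupport::SafeBuffer, TinySafeBuffer.new(...) plays the role of html_safe, and the helper names unsafe_group and safe_group are made up for this comparison:

```ruby
require 'erb'

# Hypothetical stand-in for ActiveSupport::SafeBuffer that only
# models escape-on-append via <<.
class TinySafeBuffer < String
  def <<(other)
    super(other.is_a?(TinySafeBuffer) ? other.to_s : ERB::Util.html_escape(other.to_s))
  end
end

# Anti-pattern: interpolates content, then vouches for the whole string.
def unsafe_group(content)
  TinySafeBuffer.new("<div class='group'>#{content}</div>")
end

# Fix: only the literal markup is trusted, content is appended as-is
# and escaped by the buffer unless the caller marked it as safe.
def safe_group(content)
  html = TinySafeBuffer.new
  html << TinySafeBuffer.new("<div class='group'>")
  html << content
  html << TinySafeBuffer.new("</div>")
  html
end

attack = "<script>alert(1)</script>"

puts unsafe_group(attack)
# prints <div class='group'><script>alert(1)</script></div>

puts safe_group(attack)
# prints <div class='group'>&lt;script&gt;alert(1)&lt;/script&gt;</div>

puts safe_group(TinySafeBuffer.new("<b>trusted</b>"))
# prints <div class='group'><b>trusted</b></div>
```

The caller decides what is trusted: plain user input is neutralized, while markup the caller explicitly marked as safe passes through unchanged.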
Note that built-in Rails helpers like link_to or content_tag also generate HTML tags around content that might or might not be user input. These helpers also work like the last implementation of group above, so we could refactor our helper to this:
def group(content)
  content_tag(:div, content, class: 'group')
end