How to fix gsub on SafeBuffer objects

Updated . Posted . Visible to the public.

If you have an html_safe string, you won't be able to call gsub with a block and match reference variables like $1. They will be nil inside the block where you define replacements (as you already know).

This issue applies to both Rails 2 (with rails_xss) as well as Rails 3 applications.

Here is a fix to SafeBuffer#gsub. Note that it will only fix the $1 behavior, not give you a safe string in the end (see below).

Example

def test(input)
  input.gsub /(f)/ do
    puts $1
    'b'
  end
end

>> test('foo')
f
=> "boo"

>> test('foo'.html_safe)
nil
=> "boo"

Wat? Show archive.org snapshot

More fun: It's not a problem when using inline blocks.

>> 'foo'.html_safe.gsub(/(f)/) { puts $1; 'b' }
f
=> "boo"

Why?

This is because of the way rails_xss implements "unsafe" methods:

# vendor/plugins/rails_xss/lib/rails_xss/string_ext.rb

UNSAFE_STRING_METHODS = [ ..., "gsub", ... ].freeze

for unsafe_method in UNSAFE_STRING_METHODS
  class_eval <<-EOT, __FILE__, __LINE__ + 1
    def #{unsafe_method}(*args)
      super.to_str
    end

    def #{unsafe_method}!(*args)
      raise TypeError, "Cannot modify SafeBuffer in place"
    end
  EOT
end

It is correct to use to_str to force an unsafe string, since using gsub may very well turn a safe string into an unsafe one.

Unfortunately, the implementation also means that the block you pass (with a "gsub do") to a SafeBuffer will not be the block that you pass when doing that on a "normal" String.

What's happening here is that the "string" is matched against your regular expression, which populates the global match object $~. While the block itself will be passed on by the super call, its global match bindings are no longer valid, as they are reset when entering a new block.

How to fix it

Since you can't really expect outside code (read: Gems) to not use $1 (and there is plenty, believe me) when calling gsub on an input that may be a SafeBuffer, you need to fix this behavior yourself.

This works like a charm:

class ActiveSupport::SafeBuffer

  def gsub(*, &block)
    if block_given?
      super do |*other_args|
        Thread.current[:LAST_MATCH_DATA] = $~
        eval("$~ = Thread.current[:LAST_MATCH_DATA]", block.binding)
        block.call(*other_args)
      end
    else
      super.to_str
    end
  end

end

Now we grab the global match object our "outside" block received. We populate it to the inside block's scope so that your replacement logic can again access $1, $2, and all their friends.

The result is not a SafeBuffer but an unsafe String. This is for good reason (and we explicitly call to_str for that), since you could make safe strings unsafe with the right/wrong replacements.

Arne Hartherz
Last edit
Jakob Scholz
License
Source code in this card is licensed under the MIT License.
Posted by Arne Hartherz to makandra dev (2012-09-07 14:29)