If you have an html_safe
string, you won't be able to call gsub
with a block and match reference variables like $1
. They will be nil
inside the block where you define replacements (as you already know).
This issue applies to both Rails 2 (with rails_xss
) as well as Rails 3 applications.
Here is a fix to SafeBuffer#gsub
. Note that it will only fix the $1
behavior, not give you a safe string in the end (see below).
Example
def test(input)
input.gsub /(f)/ do
puts $1
'b'
end
end
>> test('foo')
f
=> "boo"
>> test('foo'.html_safe)
nil
=> "boo"
Wat? Show archive.org snapshot
More fun: It's not a problem when using inline blocks.
>> 'foo'.html_safe.gsub(/(f)/) { puts $1; 'b' }
f
=> "boo"
Why?
This is because of the way rails_xss
implements "unsafe" methods:
# vendor/plugins/rails_xss/lib/rails_xss/string_ext.rb
UNSAFE_STRING_METHODS = [ ..., "gsub", ... ].freeze
for unsafe_method in UNSAFE_STRING_METHODS
class_eval <<-EOT, __FILE__, __LINE__ + 1
def #{unsafe_method}(*args)
super.to_str
end
def #{unsafe_method}!(*args)
raise TypeError, "Cannot modify SafeBuffer in place"
end
EOT
end
It is correct to use to_str
to force an unsafe string, since using gsub
may very well turn a safe string into an unsafe one.
Unfortunately, the implementation also means that the block you pass (with a "gsub do
") to a SafeBuffer
will not be the block that you pass when doing that on a "normal" String
.
What's happening here is that the "string" is matched against your regular expression, which populates the global match object $~
. While the block itself will be passed on by the super
call, its global match bindings are no longer valid, as they are reset when entering a new block.
How to fix it
Since you can't really expect outside code (read: Gems) to not use $1 (and there is plenty, believe me) when calling gsub
on an input that may be a SafeBuffer
, you need to fix this behavior yourself.
This works like a charm:
class ActiveSupport::SafeBuffer
def gsub(*, &block)
if block_given?
super do |*other_args|
Thread.current[:LAST_MATCH_DATA] = $~
eval("$~ = Thread.current[:LAST_MATCH_DATA]", block.binding)
block.call(*other_args)
end
else
super.to_str
end
end
end
Now we grab the global match object our "outside" block received. We populate it to the inside block's scope so that your replacement logic can again access $1
, $2
, and all their friends.
The result is not a SafeBuffer
but an unsafe String
. This is for good reason (and we explicitly call to_str
for that), since you could make safe strings unsafe with the right/wrong replacements.