If you have an html_safe string, you can't call gsub with a block and rely on match reference variables like $1: they will be nil inside the block where you define replacements.
This issue applies to both Rails 2 (with rails_xss) and Rails 3 applications.
Here is a fix to SafeBuffer#gsub. Note that it will only fix the $1 behavior, not give you a safe string in the end (see below).
Example
def test(input)
  input.gsub /(f)/ do
    puts $1
    'b'
  end
end
>> test('foo')
f
=> "boo"
>> test('foo'.html_safe)
nil
=> "boo"
Wat?
More fun: It's not a problem when using inline blocks.
>> 'foo'.html_safe.gsub(/(f)/) { puts $1; 'b' }
f
=> "boo"
Why?
This is because of the way rails_xss implements "unsafe" methods:
# vendor/plugins/rails_xss/lib/rails_xss/string_ext.rb
UNSAFE_STRING_METHODS = [ ..., "gsub", ... ].freeze
for unsafe_method in UNSAFE_STRING_METHODS
  class_eval <<-EOT, __FILE__, __LINE__ + 1
    def #{unsafe_method}(*args)
      super.to_str
    end
    def #{unsafe_method}!(*args)
      raise TypeError, "Cannot modify SafeBuffer in place"
    end
  EOT
end
It is correct to use to_str to force an unsafe string, since using gsub may very well turn a safe string into an unsafe one.
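Unfortunately, the implementation also means that a block you pass to a SafeBuffer (with a "gsub do") does not run under the same conditions as a block you pass to a "normal" String.
What's happening here is that the string is matched against your regular expression, which populates the match object $~ (and with it $1, $2, and so on). Despite the "$", these variables are not truly global: Ruby keeps them local to the current method frame. With the wrapper in place, String#gsub is called from inside the generated gsub method, so $~ is set in the wrapper's frame. The block itself is still passed on by the super call, but it reads $~ from the method in which it was written, where no match was ever recorded.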
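The effect can be reproduced without Rails. The following is a minimal sketch, assuming a hypothetical String subclass WrappedString that wraps gsub in the same way as the generated rails_xss method. The class and the helper group_seen_in_block are made up for illustration and are not part of rails_xss or Rails.

```ruby
# WrappedString is a hypothetical stand-in for SafeBuffer, not the rails_xss class.
class WrappedString < String
  # Same shape as the generated wrapper: delegate to String#gsub
  # (the block is passed along implicitly) and return a plain String.
  def gsub(*args)
    super.to_str
  end
end

# Hypothetical helper: reports what $1 looked like inside the replacement block.
def group_seen_in_block(input)
  seen = :block_never_ran
  input.gsub(/(f)/) do
    seen = $1
    'b'
  end
  seen
end

p group_seen_in_block('foo')                    # "f"
p group_seen_in_block(WrappedString.new('foo')) # nil
```

The replacement itself still happens in both cases; only the match variables visible to the block differ.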
How to fix it
Since you can't really expect outside code (read: Gems) to not use $1 (and there is plenty, believe me) when calling gsub on an input that may be a SafeBuffer, you need to fix this behavior yourself.
This works like a charm:
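class ActiveSupport::SafeBuffer
  def gsub(*, &block)
    if block_given?
      result = super do |*other_args|
        # String#gsub recorded the current match in this method's frame
        Thread.current[:LAST_MATCH_DATA] = $~
        # Copy it into the frame in which the caller's block was defined
        eval("$~ = Thread.current[:LAST_MATCH_DATA]", block.binding)
        block.call(*other_args)
      end
      # Force a plain (unsafe) String, just like the non-block branch
      result.to_str
    else
      super.to_str
    end
  end
end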
Now we grab the match object that our wrapping block received and copy it into the scope of the block you passed in, so that your replacement logic can again access $1, $2, and all their friends.
The result is not a SafeBuffer but an unsafe String. This is for good reason (and we explicitly call to_str for that), since you could make safe strings unsafe with the right/wrong replacements.
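To try both effects in isolation, here is a rough sketch that applies the same technique to a hypothetical String subclass PatchedString instead of ActiveSupport::SafeBuffer, so it runs without Rails. PatchedString and upcase_captures are assumptions for illustration only, and this variant calls to_str in both branches.

```ruby
# PatchedString is a hypothetical stand-in for a patched SafeBuffer.
class PatchedString < String
  def gsub(*args, &block)
    # Without a block there are no match variables to restore.
    return super.to_str unless block_given?

    replaced = super(*args) do |*block_args|
      # String#gsub recorded the match in this method's frame ...
      Thread.current[:LAST_MATCH_DATA] = $~
      # ... so copy it into the frame where the caller's block was defined.
      eval("$~ = Thread.current[:LAST_MATCH_DATA]", block.binding)
      block.call(*block_args)
    end
    # Always hand back a plain String.
    replaced.to_str
  end
end

# Hypothetical caller that relies on $1 inside a "gsub do" block.
def upcase_captures(input)
  input.gsub(/(o)/) do
    $1.upcase
  end
end

result = upcase_captures(PatchedString.new('foo'))
p result       # "fOO"
p result.class # String
```

The match data travels through Thread.current because the string handed to eval cannot see the local variables of the patched gsub method.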