Regex: Be careful when trying to match the start and/or end of a text

Ruby has two different ways to match the start and the end of a text:

  • ^ (Start of line) and $ (End of line)
  • \A (Start of string) and \z (End of string)

When you validate a whole string, you almost always want \A and \z.
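The difference between the two anchor pairs can be seen on a small multi-line string. This is only a minimal sketch: the sample text and variable names are made up for illustration, and \Z (capital Z) is not covered in this card; it is included only to contrast it with \z. String#match? requires Ruby 2.4 or later.

```ruby
# Illustrative sample text (an assumption, not taken from the card)
text = "first\nsecond\n"

# ^ and $ anchor at every line boundary, so each line matches on its own.
lines = text.scan(/^\w+$/)
# => ["first", "second"]

# \A anchors only at the very start of the string.
starts_with_first  = text.match?(/\Afirst/)   # => true
starts_with_second = text.match?(/\Asecond/)  # => false

# \z anchors only at the very end, so the trailing newline gets in the way.
strict_end = text.match?(/second\z/)          # => false

# \Z additionally matches right before a single trailing newline.
lenient_end = text.match?(/second\Z/)         # => true
```

For validations, \z is usually the stricter and safer choice, because \Z would still accept a value with a trailing newline.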

Here is a short example in which we want to validate the content type of a file attachment. With the regular expression image\/(jpeg|png), we would not expect content_type_1 to be accepted as a valid content type. But because ^ and $ match at line boundaries, the pattern matches both content_type_1 and content_type_2. With \A and \z the pattern behaves as expected and rejects content_type_1.

content_type_1 = "image/jpeg\napplication/javascript"
content_type_2 = "image/jpeg"


# Using `^` and `$`
content_type_1.match(/^image\/(jpeg|png)$/)
# => #<MatchData "image/jpeg" 1:"jpeg">
content_type_2.match(/^image\/(jpeg|png)$/)
# => #<MatchData "image/jpeg" 1:"jpeg">


# Using `\A` and `\z`
content_type_1.match(/\Aimage\/(jpeg|png)\z/)
# => nil
content_type_2.match(/\Aimage\/(jpeg|png)\z/)
# => #<MatchData "image/jpeg" 1:"jpeg">
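Outside of a framework, the same check could be wrapped in a small predicate. The following is a sketch under assumptions: the constant IMAGE_CONTENT_TYPE and the method valid_image_content_type? are hypothetical names chosen for illustration, and String#match? requires Ruby 2.4 or later.

```ruby
# Hypothetical constant: the pattern from the example, anchored with \A and \z
IMAGE_CONTENT_TYPE = /\Aimage\/(jpeg|png)\z/

# Hypothetical helper: returns true only if the whole string is an allowed type.
def valid_image_content_type?(value)
  # to_s turns nil into "", which does not match.
  # match? returns a plain boolean without building a MatchData object.
  value.to_s.match?(IMAGE_CONTENT_TYPE)
end

valid_image_content_type?("image/jpeg")                         # => true
valid_image_content_type?("image/jpeg\napplication/javascript") # => false
valid_image_content_type?("image/jpeg\n")                       # => false
valid_image_content_type?(nil)                                  # => false
```

Because \z is used rather than $, even a single trailing newline causes the value to be rejected.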

Rails

Rails 4 and later refuses to accept ^ and $ in a format validation: declaring such a validation raises an ArgumentError, because line-based anchors might be a security risk.

Validation used in the model:

validates_format_of :content_type, :with => /^image\/(jpeg|png)$/

Resulting exception:

The provided regular expression is using multiline anchors (^ or $), which may present a security risk. Did you mean to use \A and \z, or forgot to add the :multiline => true option? (ArgumentError)

In most cases the right fix is to switch to \A and \z. If you really do want line-based matching, you can avoid the exception by passing the multiline option:

validates_format_of :content_type, :with => /^image\/(jpeg|png)$/, :multiline => true
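To find risky patterns in your own code before Rails complains, a simple heuristic can inspect the source of a regular expression. The sketch below is a rough approximation of such a check. The method name multiline_anchors? is hypothetical, and this is not presented as Rails' actual implementation. It only looks at the first and last character of the pattern, so anchors in the middle of a pattern go unnoticed.

```ruby
# Hypothetical helper, NOT Rails' own code: reports whether a pattern
# starts with ^ or ends with an unescaped $.
def multiline_anchors?(regexp)
  source = regexp.source
  starts_with_caret = source.start_with?("^")
  # A trailing "\$" is a literal dollar sign, not an anchor.
  ends_with_dollar = source.end_with?("$") && !source.end_with?("\\$")
  starts_with_caret || ends_with_dollar
end

multiline_anchors?(/^image\/(jpeg|png)$/)   # => true
multiline_anchors?(/\Aimage\/(jpeg|png)\z/) # => false
multiline_anchors?(/\Aprice: \$/)           # => false
```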

Last edit: Michael Leimstädtner
Source code in this card is licensed under the MIT License.
Posted by Thomas Eisenbarth to makandra dev (2013-02-03 10:52)