Regex: Be careful when trying to match the start and/or end of a text

Ruby has two different ways to match the start and the end of a text:

  • ^ (Start of line) and $ (End of line)
  • \A (Start of string) and \z (End of string)

Most often you want to use \A and \z.

Here is a short example in which we want to validate the content type of a file attachment. With the regular expression image\/(jpeg|png), we would not expect content_type_1 to be accepted as a valid content type. But because ^ and $ match at the start and end of each line, the pattern matches both content_type_1 and content_type_2. Using \A and \z works as expected and rejects content_type_1.

content_type_1 = "image/jpeg\napplication/javascript"
content_type_2 = "image/jpeg"


# Using `^` and `$`
content_type_1.match(/^image\/(jpeg|png)$/)
# => <MatchData "image/jpeg" 1:"jpeg">
content_type_2.match(/^image\/(jpeg|png)$/)
# => <MatchData "image/jpeg" 1:"jpeg">


# Using `\A` and `\z`
content_type_1.match(/\Aimage\/(jpeg|png)\z/)
# => nil
content_type_2.match(/\Aimage\/(jpeg|png)\z/)
# => <MatchData "image/jpeg" 1:"jpeg">
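
A related subtlety is a trailing newline. The following sketch uses a hypothetical helper, end_anchor_results, which is not part of the original example. It suggests why \z is usually the safer end anchor: both $ and \Z (capital Z) still match in front of a final newline, while \z only matches at the very end of the string.

```ruby
# Hypothetical helper for illustration: reports which end anchors accept `text`.
def end_anchor_results(text)
  {
    dollar:  text.match?(/\Aimage\/(jpeg|png)$/),  # `$` matches before a line break
    upper_z: text.match?(/\Aimage\/(jpeg|png)\Z/), # `\Z` tolerates one final newline
    lower_z: text.match?(/\Aimage\/(jpeg|png)\z/)  # `\z` requires the true end of string
  }
end

results = end_anchor_results("image/jpeg\n")
puts results[:dollar]  # true
puts results[:upper_z] # true
puts results[:lower_z] # false
```

For a string without the trailing newline, all three variants match.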

Rails

Newer Rails versions raise an error when you use ^ or $ in a format validation's regular expression, as this might be a security risk.

Validation used in the model:

validates_format_of :content_type, :with => /^image\/(jpeg|png)$/

Resulting exception:

The provided regular expression is using multiline anchors (^ or $), which may present a security risk. Did you mean to use \A and \z, or forgot to add the :multiline => true option? (ArgumentError)
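
To get a feel for what triggers this error, here is a rough approximation in plain Ruby. The helper uses_line_anchors? is hypothetical and only an assumption about the kind of check involved, not the actual Rails implementation: it inspects the source of the pattern and flags a leading ^ or an unescaped trailing $.

```ruby
# Hypothetical helper, NOT the Rails implementation: flags a pattern whose
# source starts with `^` or ends with a `$` that is not escaped as `\$`.
def uses_line_anchors?(regexp)
  source = regexp.source
  source.start_with?("^") || (source.end_with?("$") && !source.end_with?("\\$"))
end

puts uses_line_anchors?(/^image\/(jpeg|png)$/)   # true
puts uses_line_anchors?(/\Aimage\/(jpeg|png)\z/) # false
```

Note that such a check only looks at the outer edges of the pattern, so it would not notice ^ or $ used somewhere in the middle.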

You can silence this error by changing your validation like this (be sure you really want line-based matching; otherwise switch the anchors to \A and \z):

validates_format_of :content_type, :with => /^image\/(jpeg|png)$/, :multiline => true
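
For contrast, line anchors are the right tool when the input is meant to be processed line by line. A minimal sketch in plain Ruby, assuming a hypothetical helper image_types_in that collects every line of a multi-line string that is an accepted image type:

```ruby
# Hypothetical helper for illustration: returns each line that is exactly
# an accepted image content type. Here `^` and `$` are intentional.
def image_types_in(text)
  text.scan(/^image\/(?:jpeg|png)$/)
end

p image_types_in("image/jpeg\napplication/javascript\nimage/png")
# prints ["image/jpeg", "image/png"]
```

When validating a single value, such as one content type, this per-line behavior is exactly what you do not want.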

Last edit: Henning Koch
License
Source code in this card is licensed under the MIT License.
Posted by Thomas Eisenbarth to makandra dev (2013-02-03 10:52)