Posted over 4 years ago. Visible to the public. Repeats.

A common mistake in validations using regular expressions

You probably use regular expressions to validate strings such as e-mail addresses, e.g. by saying:

validates_format_of :email, :with => /.../

Such regular expressions often look something like /^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i, which matches valid input as expected:

>> "foo.bar-ooops@bar.com".match /^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i
=> #<MatchData "foo.bar-ooops@bar.com">

… and does not match unwanted values:

>> "invalid email@invalid host.com".match /^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i
=> nil

Note that this expression is not sufficient to validate e-mail addresses according to the RFC; it is just an example.

Problem

In Ruby, ^ and $ match the start and end of each line, not of the whole string. The expression above therefore only validates a single line of the input. Anything is allowed before or after that line, e.g. after a newline:

>> "foo.bar-ooops@bar.com\n<script> This is bad... </script>".match /^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i
=> #<MatchData "foo.bar-ooops@bar.com">
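
The behaviour can be reproduced outside of Rails with a small plain-Ruby sketch. The constant LINE_ANCHORED and the helper line_anchored_valid? are hypothetical names chosen for this illustration. The sketch suggests that the payload may sit before the valid line just as well as after it:

```ruby
# The example pattern from above. ^ and $ anchor to line boundaries.
LINE_ANCHORED = /^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i

# Hypothetical helper: true if ANY line of the value looks like an e-mail.
def line_anchored_valid?(value)
  !!(LINE_ANCHORED =~ value)
end

samples = [
  "foo@bar.com",                              # a single valid line
  "foo@bar.com\n<script>alert(1)</script>",   # payload after the newline
  "<script>alert(1)</script>\nfoo@bar.com",   # payload before the newline
  "not an email"                              # no line matches
]

samples.each do |sample|
  puts "#{sample.inspect} => #{line_anchored_valid?(sample)}"
end
```

Only the last sample is rejected; the two multi-line samples pass because one of their lines matches on its own.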

Solution

Use \A to match the start of the string and \z to match its end in your validation expression:

>> "foo.bar-ooops@bar.com\n<script> This is bad... </script>".match /\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z/i
=> nil
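
As a minimal plain-Ruby sketch of the fix, the following wraps the string-anchored pattern in a helper. The names STRING_ANCHORED, TRAILING_NEWLINE_TOLERANT and email_format_valid? are assumptions made for this example, not part of Rails. It also illustrates why lowercase \z is used: uppercase \Z still accepts a single trailing newline.

```ruby
# \A and \z anchor to the start and end of the whole string.
STRING_ANCHORED = /\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z/i

# Same pattern with \Z, which also matches just before a final newline.
TRAILING_NEWLINE_TOLERANT = /\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\Z/i

# Hypothetical helper: true only if the entire value looks like an e-mail.
# Regexp#=~ returns nil for a nil argument, so nil is rejected as well.
def email_format_valid?(value)
  !!(STRING_ANCHORED =~ value)
end

puts email_format_valid?("foo.bar-ooops@bar.com")                    # true
puts email_format_valid?("foo.bar-ooops@bar.com\n<script></script>") # false
puts email_format_valid?("foo@bar.com\n")                            # false
puts !!(TRAILING_NEWLINE_TOLERANT =~ "foo@bar.com\n")                # true
```

If a trailing newline is acceptable in your input, stripping the value before validation is usually clearer than relying on \Z.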

See also Ruby regular expression start/end line vs. start/end string


Author of this card: Thomas Eisenbarth
Last edit: 11 months ago by Pascal Schmid
Posted by Thomas Eisenbarth to makandropedia