Posted about 5 years ago. Visible to the public. Repeats.

A common mistake in validations using regular expressions

Regular expressions are commonly used to validate strings such as e-mail addresses, e.g. by saying

validates_format_of :email, :with => /.../

Such regular expressions often look something like /^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i, which matches a well-formed address as expected:

>> "user@example.com".match(/^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i)
=> #<MatchData "user@example.com">

… and does not match unwanted values:

>> "invalid email@invalid".match(/^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i)
=> nil

This expression is not sufficient to validate e-mail addresses according to the RFC; it is just an example.


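In Ruby, ^ and $ match at the start and end of each line, not of the whole string. The expression above therefore only validates a single line: as long as one line looks like an e-mail address, anything is allowed on the other side of a newline:

>> "user@example.com\n<script> This is bad... </script>".match(/^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i)
=> #<MatchData "user@example.com">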
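To make the line-by-line behaviour concrete, here is a minimal sketch in plain Ruby (no Rails). The constant and method names are hypothetical and only serve the illustration; the pattern is the line-anchored one from above. It suggests that the unwanted content can sit before the address just as well as after it.

```ruby
# Hypothetical names for illustration; the pattern is the one discussed above.
LINE_ANCHORED = /^[\w+\-.]+@[a-z\d\-.]+\.[a-z]+$/i

# Returns true if any single line of the input looks like an e-mail address,
# because ^ and $ anchor to line boundaries in Ruby.
def accepted_by_line_anchors?(input)
  !LINE_ANCHORED.match(input).nil?
end

puts accepted_by_line_anchors?("user@example.com")                    # true
puts accepted_by_line_anchors?("user@example.com\n<script></script>") # true
puts accepted_by_line_anchors?("<script></script>\nuser@example.com") # true
puts accepted_by_line_anchors?("<script></script>")                   # false
```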


Use \A to anchor the start of the string and \z to anchor its end in your validation expression:

>> "user@example.com\n<script> This is bad... </script>".match(/\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z/i)
=> nil
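The fix can be wrapped in a small helper, sketched here in plain Ruby under the assumption that you want a standalone check outside of a Rails validation. The names valid_email_format?, STRING_ANCHORED and TRAILING_NEWLINE_OK are hypothetical. The sketch also contrasts \z with the uppercase \Z, which still tolerates a single trailing newline and is therefore probably not what you want here.

```ruby
# Hypothetical names for illustration; same pattern, anchored to the whole string.
STRING_ANCHORED     = /\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z/i
# Same pattern with \Z, which also matches just before a final newline.
TRAILING_NEWLINE_OK = /\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\Z/i

# True only if the entire string, start to end, looks like an e-mail address.
def valid_email_format?(value)
  STRING_ANCHORED.match?(value.to_s)
end

puts valid_email_format?("user@example.com")                    # true
puts valid_email_format?("user@example.com\n<script></script>") # false
puts valid_email_format?("<script></script>\nuser@example.com") # false
puts valid_email_format?("user@example.com\n")                  # false
puts TRAILING_NEWLINE_OK.match?("user@example.com\n")           # true
```

Regexp#match? requires Ruby 2.4 or newer; on older versions, `!!(value.to_s =~ STRING_ANCHORED)` should behave the same way.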

See also Ruby regular expression start/end line vs. start/end string


Author of this card:

Thomas Eisenbarth
Last edit:
over 1 year ago
by Pascal Schmid
Posted by Thomas Eisenbarth to makandra dev