Posted over 8 years ago.

Regular Expressions - Cheat Sheet

You can write regular expressions in several ways, e.g. /regex/ and %r{regex}. For examples, look here.
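
A minimal Ruby sketch of the different notations. The URL pattern and the constant names are made up for illustration; Regexp.new is a third option for building a pattern from a string at runtime.

```ruby
# Three ways to write the same pattern (constant names are illustrative).
SLASH_FORM   = /https?:\/\/example\.com/            # slashes must be escaped
PERCENT_FORM = %r{https?://example\.com}            # slashes need no escaping
DYNAMIC_FORM = Regexp.new('https?://example\.com')  # built from a string at runtime

url = 'see https://example.com/path'
[SLASH_FORM, PERCENT_FORM, DYNAMIC_FORM].each do |regex|
  puts url[regex]   # prints https://example.com each time
end
```

%r{...} tends to be the most readable form when the pattern itself contains slashes.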


Literal Characters
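Every character matches itself, except for the following special characters, which must be escaped with a backslash to be matched literally:
[ ] \ ^ $ . | ? * + ( )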
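
A short Ruby sketch of escaping special characters. The sample strings and constant names are assumptions for illustration; Regexp.escape is part of Ruby's core library.

```ruby
# A special character loses its meaning when preceded by a backslash.
PRICE = /\$5\.00/
puts 'costs $5.00 today'[PRICE]            # $5.00
puts 'costs $5x00 today'.match?(PRICE)     # false, the escaped dot is literal

# Regexp.escape adds the backslashes for you, e.g. for user input.
ESCAPED = Regexp.escape('1+1=2?')
puts ESCAPED                               # 1\+1=2\?
puts '1+1=2?'.match?(Regexp.new(ESCAPED))  # true
```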

Character Classes
[ae] matches a or e, e.g. gr[ae]y matches grey or gray, but NOT graay or graey

[0-9] matches a SINGLE digit in the range from 0 to 9
[0-9a-fA-F] matches a hexadecimal digit
^ negates a character class: q[^x] matches qu in question, but NOT Iraq, since there is no character after the q for the negated character class to match
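
The character classes above can be tried out in Ruby like this (a sketch; the constant names are illustrative):

```ruby
GREY      = /gr[ae]y/
HEX_DIGIT = /[0-9a-fA-F]/
Q_NOT_X   = /q[^x]/

puts 'grey'.match?(GREY)            # true
puts 'gray'.match?(GREY)            # true
puts 'graey'.match?(GREY)           # false, the class matches exactly one character
puts 'x7Fz'.scan(HEX_DIGIT).inspect # ["7", "F"]
puts 'question'[Q_NOT_X]            # qu
puts 'Iraq'[Q_NOT_X].inspect        # nil, nothing follows the q
```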

Shorthand Characters
\d matches a single character that is a digit
\w matches a word character (alphanumeric characters plus underscore)
\s matches a whitespace character (including tabs and line breaks)
\t matches a tab character
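
A quick Ruby sketch of the shorthands, using a made-up sample string:

```ruby
TEXT = "Room 42,\tfloor_3\n"

puts TEXT.scan(/\d/).inspect    # ["4", "2", "3"]
puts TEXT.scan(/\w+/).inspect   # ["Room", "42", "floor_3"]
puts TEXT.scan(/\s/).inspect    # [" ", "\t", "\n"]
puts TEXT.scan(/\t/).inspect    # ["\t"]
```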

Non-Printable Characters
\xFF matches the character with the hexadecimal code FF
\uFFFF matches the Unicode character with the code point FFFF, e.g. \u20AC matches €
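
A small Ruby sketch of both escapes. It uses \x41 (the letter A) rather than \xFF, on the assumption that the source file is UTF-8 encoded, where a lone \xFF byte is not a valid character.

```ruby
EURO     = /\u20AC/   # the euro sign by its Unicode code point
LETTER_A = /\x41/     # the letter A by its hexadecimal code

puts "Price: 10 \u20AC".match?(EURO)   # true
puts 'Price: 10 USD'.match?(EURO)      # false
puts 'ABC'[LETTER_A]                   # A
```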

Dot
. matches any single character, in most flavors except line breaks: it is short for [^\n] in Unix regex flavors and [^\r\n] in Windows regex flavors
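
In Ruby, the dot stops at a newline unless the m modifier is set. A sketch with a made-up two-line string:

```ruby
TWO_LINES = "first\nsecond"

puts TWO_LINES[/.+/]           # first
puts TWO_LINES[/.+/m].inspect  # "first\nsecond"
```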


Anchors
^ matches the start of a line
$ matches the end of a line
\A matches the start of a string
\z matches the end of a string
\b matches a word boundary. A word boundary is a position between a character that can be matched by \w and a character that cannot be matched by \w. It also matches at the start and/or end of the string if the first and/or last characters in the string are word characters.
\B matches at every position where \b cannot match.
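
A Ruby sketch of the anchors, using made-up sample strings. Note that in Ruby ^ and $ match at every line of a multi-line string, without any modifier.

```ruby
LINES = "foo\nbar"
WORDS = 'cat concat cats'

puts LINES.scan(/^\w+$/).inspect      # ["foo", "bar"]
puts LINES.match?(/\Afoo$/)           # true, $ matches before the line break
puts LINES.match?(/\Afoo\z/)          # false, the string does not end after foo
puts WORDS.scan(/\bcat\b/).inspect    # ["cat"], only the standalone word
puts WORDS.index(/\Bcat/)             # 7, the cat inside concat
```

Because of this, \A and \z are usually the safer choice when a whole string has to be validated.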


Alternation
cat|dog matches cat in "About cats and dogs". If the regex is applied again to the rest of the string, it matches dog.
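
In Ruby, a single match returns the first alternative found, while String#scan keeps applying the regex to the rest of the string (a sketch; the constant names are illustrative):

```ruby
PETS     = /cat|dog/
SENTENCE = 'About cats and dogs'

puts SENTENCE[PETS]               # cat
puts SENTENCE.scan(PETS).inspect  # ["cat", "dog"]
```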


Repetition
? matches zero or one time, e.g. colou?r matches colour or color
* matches zero or more times
+ matches one or more times
<[A-Za-z][A-Za-z0-9]*> matches an HTML tag without any attributes. <[A-Za-z0-9]+> is easier to write but also matches invalid tags such as <1>.
Use curly braces to specify an exact amount of repetition: \b[1-9][0-9]{3}\b matches a number between 1000 and 9999, and \b[1-9][0-9]{2,4}\b matches a number between 100 and 99999.
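
The quantifiers above, tried out in Ruby on made-up input (constant names are illustrative):

```ruby
COLOR       = /colou?r/
TAG         = /<[A-Za-z][A-Za-z0-9]*>/
LOOSE_TAG   = /<[A-Za-z0-9]+>/
FOUR_DIGITS = /\b[1-9][0-9]{3}\b/
THREE_TO_5  = /\b[1-9][0-9]{2,4}\b/

puts 'color colour'.scan(COLOR).inspect              # ["color", "colour"]
puts '<EM> <1> <h1>'.scan(TAG).inspect               # ["<EM>", "<h1>"]
puts '<EM> <1> <h1>'.scan(LOOSE_TAG).inspect         # ["<EM>", "<1>", "<h1>"]
puts '999 1000 9999 10000'.scan(FOUR_DIGITS).inspect # ["1000", "9999"]
puts '99 100 99999 100000'.scan(THREE_TO_5).inspect  # ["100", "99999"]
```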

Modes: Greedy, Lazy and Possessive

Example string: This test is a <EM>first</EM> test string.

  • greedy: default. * and + match as much as they can and give characters back only when the rest of the regex cannot match otherwise, i.e. the .* in /.*test/ first matches the whole example string and then backtracks until test can match, so the overall match is: This test is a <EM>first</EM> test

  • lazy (ungreedy): specified by adding a question mark to the quantifier. *? and +? match as little as possible, i.e. /.*?test/ matches This test.

  • possessive: specified by adding a plus sign to the quantifier. Reads like "greedy without backtracking": *+ and ++ match as much as they can and never give characters back, so the regex fails if the rest cannot match, i.e. /\d++/ matches 333 whereas /\d++3/ does not. (A lazy /\d+?/ would only match 3.)
    Use it with caution. Mostly you'll want to use it for small expressions, e.g. for nested sub-regexes.
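
The three modes side by side, as a Ruby sketch using the example string from above (the constant name is illustrative):

```ruby
SAMPLE = 'This test is a <EM>first</EM> test string.'

# greedy vs. lazy
puts SAMPLE[/.*test/]    # This test is a <EM>first</EM> test
puts SAMPLE[/.*?test/]   # This test
puts SAMPLE[/<.+>/]      # <EM>first</EM>
puts SAMPLE[/<.+?>/]     # <EM>

# possessive: \d++ keeps all digits, so the trailing 3 can never match
puts '333'[/\d++/]           # 333
puts '333'[/\d++3/].inspect  # nil
puts '333'[/\d+3/]           # 333, the greedy version backtracks by one digit
puts '333'[/\d+?/]           # 3
```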


Look-arounds
Look-arounds let you match depending on context. You can look behind and look ahead, each in a positive or a negative way. The look-around itself is not part of the match.

  • /foo(?=bar)/ matches the foo in foobar but not in this food is bad
  • /otto(?!normal)/ matches the otto in ottomotor but not in ottonormalverbraucher
  • /(?<=ma)kandra/ matches the kandra in makandra but not in kandra
  • /(?<!foo)bar/ matches the bar in moo bar but not in foobar
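
A Ruby sketch of the four look-arounds. The last line uses a made-up price string to show that the look-around text is not part of the match and therefore survives a replacement.

```ruby
AHEAD      = /foo(?=bar)/
NOT_AHEAD  = /otto(?!normal)/
BEHIND     = /(?<=ma)kandra/
NOT_BEHIND = /(?<!foo)bar/

puts 'foobar'[AHEAD]                            # foo
puts 'this food is bad'[AHEAD].inspect          # nil
puts 'ottomotor'[NOT_AHEAD]                     # otto
puts 'ottonormalverbraucher'[NOT_AHEAD].inspect # nil
puts 'makandra'[BEHIND]                         # kandra
puts 'kandra'[BEHIND].inspect                   # nil
puts 'moo bar'[NOT_BEHIND]                      # bar
puts 'foobar'[NOT_BEHIND].inspect               # nil

puts 'price: 100 EUR'.sub(/\d+(?= EUR)/, '200') # price: 200 EUR
```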

Modifiers in Ruby

Add modifiers after the final slash, e.g. /Regex/im, or at the beginning of the regex, e.g. /(?i)regex/.

  • i: case insensitivity
  • m: make . also match newlines. Note that m has this meaning in Ruby only: in Perl and JavaScript, m changes how ^ and $ behave, and the s modifier makes . match newlines.
  • o: evaluate string interpolation only once (e.g. /foo#{Counter.value}/)
  • x: ignore whitespace (and comments) inside the regex. Allows for definitions like this:

    /
      <
      (3)+   # repeating part
      \ you  # need to escape this space!
    /x

    Unescaped whitespace is removed from the pattern before matching (/(?= foo ) bar/x is the same as /(?=foo)bar/). Hence, to match a space you need to escape it.

    x has unexpected side effects: /foo +/x matches foo and foofoo; it seems to actually use /(?:foo)+/ for matching. Furthermore, /I sign in as ?/x matches "What do you expect" with a match result of "" (the internal regex seems to be /(?:Isigninas)?/). Apparently the engine eliminates whitespace from left to right and turns the resulting substrings into non-capturing groups before applying quantifiers. (This was observed in Ruby; it has not been checked for other languages.)
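
A Ruby sketch of the i, m and x modifiers, both after the final slash and inline. The sample strings and the constant name are made up for illustration.

```ruby
# i: case insensitivity, as a trailing modifier or inline
puts 'RUBY'[/ruby/i]       # RUBY
puts 'RUBY'[/(?i)ruby/]    # RUBY

# m: the dot also matches newlines
puts "a\nb".match?(/a.b/)  # false
puts "a\nb".match?(/a.b/m) # true

# x: whitespace and comments inside the regex are ignored
HEART = /
  <
  (3)+    # repeating part
  \ you   # the space before "you" is escaped
/x

puts 'I <333 you'[HEART]   # <333 you
puts '<3you'.match?(HEART) # false, the escaped space is required
```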


Owner of this card:

Martin Straub
Last edit:
almost 6 years ago
RegEx, regexp, multiple, lines, multi-line
Posted by Martin Straub to makandra dev