Unicode character class escape: Match text by Unicode properties


The linked MDN article is a good introduction to a neat feature supported by all major browsers: the Unicode character class escape.

You can use it to write regular expressions that work on the full Unicode range, not just Latin/ASCII. For example, a password policy matcher might include regular expressions like [A-Za-z] or [0-9], but those do not match e.g. German umlauts or Eastern Arabic numerals. Those examples can easily be replaced with /\p{Letter}/u and /\p{Number}/u. Note that the u flag is required. The \p escape supports various properties and shorthands, e.g. \p{L} for \p{Letter} or \p{Lu} for uppercase letters, and \P{...} negates the match.
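The difference is easy to verify in a browser console or Node.js. The following is a minimal sketch; the sample characters and variable names are chosen for illustration and are not taken from the MDN article.

```javascript
// Sample inputs, written as escapes so the code points are unambiguous.
const umlaut = '\u00E4'       // "ä", LATIN SMALL LETTER A WITH DIAERESIS
const arabicDigit = '\u0661'  // "١", ARABIC-INDIC DIGIT ONE
const greekLambda = '\u03BB'  // "λ", GREEK SMALL LETTER LAMDA

// ASCII-only character classes miss both characters.
const asciiLetter = /[A-Za-z]/.test(umlaut)   // false
const asciiDigit = /[0-9]/.test(arabicDigit)  // false

// Unicode property escapes match them. The u flag is required.
const unicodeLetter = /\p{Letter}/u.test(umlaut)       // true
const unicodeNumber = /\p{Number}/u.test(arabicDigit)  // true

// Shorthand: \p{L} is equivalent to \p{Letter}.
const shortLetter = /\p{L}/u.test(umlaut)  // true

// Capital \P{...} negates the property.
const notALetter = /\P{L}/u.test('!')  // true

// Scripts are matched with Script=<name>.
const isGreek = /\p{Script=Greek}/u.test(greekLambda)  // true

// Without the u flag, \p is only an escaped "p" followed by literal text.
const withoutFlag = /\p{L}/.test(umlaut)  // false
```

Forgetting the u flag does not raise an error, so a pattern like /\p{L}/ silently fails to match letters.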

Example password policy checker with Unicode character class escape

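const password = 'Äö١!'

const upper = /\p{Lu}/u.test(password) // uppercase letter, e.g. "Ä"
const lower = /\p{Ll}/u.test(password) // lowercase letter, e.g. "ö"
const digit = /\p{N}/u.test(password) // any number, e.g. "١"
const symbol = /[^\p{Lu}\p{Ll}\p{N}]/u.test(password) // anything else, e.g. "!"

// Count how many of the four categories are present
const matchedCategories = [upper, lower, digit, symbol].filter(Boolean).length // 4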
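The same checks can be wrapped in a reusable function. This is only a sketch: the function names countCharacterCategories and meetsPolicy and the default threshold of three categories are assumptions made for illustration, not part of any library.

```javascript
// Hypothetical helper: counts how many of the four categories occur in a password.
function countCharacterCategories(password) {
  const checks = [
    /\p{Lu}/u,                // uppercase letter
    /\p{Ll}/u,                // lowercase letter
    /\p{N}/u,                 // any kind of number
    /[^\p{Lu}\p{Ll}\p{N}]/u,  // anything else counts as a symbol
  ]
  return checks.filter((regex) => regex.test(password)).length
}

// Hypothetical policy: require at least `minCategories` different categories.
// The default of 3 is an assumed value.
function meetsPolicy(password, minCategories = 3) {
  return countCharacterCategories(password) >= minCategories
}

const allFour = countCharacterCategories('\u00C4\u00F6\u0661!')  // 4 ("Äö١!")
const onlyLower = countCharacterCategories('password')           // 1
const weakPassword = meetsPolicy('password')                     // false
```

Be aware that letters without case, such as Chinese characters (general category Lo), are neither Lu nor Ll and therefore land in the symbol bucket of this sketch.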
Michael Leimstädtner
License
Source code in this card is licensed under the MIT License.
Posted by Michael Leimstädtner to makandra dev (2025-08-28 08:27)