Regex cheat sheet
Every regex token you’ll need, with examples and what each matches in JavaScript regex (the tester on this site uses native JS regex). Notes on differences from PCRE / Python where they matter.
For interactive testing, paste any of these patterns into the tester and watch them match.
Literal characters
abc matches "abc" anywhere in the input
To match a metacharacter literally, escape it with \:
\. a literal dot
\* a literal asterisk
\\ a literal backslash
\( a literal opening paren
\/ a literal forward slash (in /pattern/ form)
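When the text to match comes from user input, escaping by hand is error-prone. A minimal sketch of one common approach follows; `escapeRegExp` is a hypothetical helper name, not a built-in, and the sample strings are made up for illustration.

```javascript
// Prefix every regex metacharacter with a backslash so the text matches literally.
// escapeRegExp is an illustrative helper name, not part of the language.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literal = new RegExp(escapeRegExp("1.5*(2)"));

console.log(literal.source);                 // 1\.5\*\(2\)
console.log(literal.test("total: 1.5*(2)")); // true
console.log(literal.test("1x5*(2)"));        // false: the dot is no longer a wildcard
```

The forward slash is left out of the escape set because it only needs escaping inside a /pattern/ literal, not in a string passed to the RegExp constructor.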
Character classes
. any character except newline (or any if `s` flag)
\d a digit (0-9) — same as [0-9]
\D a non-digit — same as [^0-9]
\w a word char — [A-Za-z0-9_]
\W a non-word char
\s whitespace — space, tab, \n, \r, \f, \v
\S non-whitespace
[abc] one of a, b, or c
[^abc] anything except a, b, or c
[a-z] range: a through z
[a-zA-Z] multiple ranges
[\d-] digit OR a literal dash (dash at end is literal)
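A few of the classes above in use, as a small sketch with made-up sample strings:

```javascript
// \d+ pulls out runs of digits.
const numbers = "Order #42-A, qty 7".match(/\d+/g);
console.log(numbers); // [ '42', '7' ]

// [\w-]+ is a word character or a literal dash (the dash is last, so it is literal).
const tokens = "a-b_c d".match(/[\w-]+/g);
console.log(tokens); // [ 'a-b_c', 'd' ]

// A negated class matches any single character NOT listed.
const hasOther = /[^abc]/.test("abcabc");
console.log(hasOther); // false

// \s+ as a separator when splitting on any whitespace run.
const fields = "a \t b".split(/\s+/);
console.log(fields); // [ 'a', 'b' ]
```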
In JavaScript with the u flag, you can match Unicode categories:
\p{Letter} any Unicode letter
\p{Script=Greek} any Greek-script character
\p{Emoji} any character with the Emoji property (this also covers 0-9, # and *)
\P{Letter} anything that's not a letter
These don’t work in JavaScript without the u flag. Support elsewhere
varies: PCRE accepts short names such as \p{L} when built with Unicode
support, while Python's built-in re module has no \p{...} at all.
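A short sketch of the property escapes with and without the u flag. The sample strings are arbitrary.

```javascript
// With the u flag, \p{...} matches by Unicode property.
const words = "Straße αβγ 123".match(/\p{Letter}+/gu);
console.log(words); // [ 'Straße', 'αβγ' ]

const greek = "abc αβγ".match(/\p{Script=Greek}+/gu);
console.log(greek); // [ 'αβγ' ]

// \P{...} is the negation.
const nonLetters = "a1-b".match(/\P{Letter}+/gu);
console.log(nonLetters); // [ '1-' ]

// Without the u flag, \p is just an escaped "p", so the pattern
// matches the literal text "p{Letter}" instead of a letter.
const withoutFlag = /\p{Letter}/.test("a");
const literalText = /\p{Letter}/.test("p{Letter}");
console.log(withoutFlag, literalText); // false true
```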
Anchors
^ start of string (or line if `m` flag)
$ end of string (or line if `m` flag)
\b word boundary
\B non-word boundary
\A start of string (some flavors; not JS)
\Z end of string (some flavors; not JS)
In JavaScript, ^ and $ match the very start and end of input by
default. Add the m flag to make them match line boundaries.
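The effect of the m flag can be checked with a two-line input. This is a small sketch with an arbitrary sample string:

```javascript
const text = "first\nsecond";

// Without m, ^ and $ only match at the very start and end of the input.
const wholeInput = /^second$/.test(text);
console.log(wholeInput); // false

// With m, they also match at line boundaries.
const perLine = /^second$/m.test(text);
console.log(perLine); // true

// g + m together: the first word of every line.
const lineStarts = text.match(/^\w+/gm);
console.log(lineStarts); // [ 'first', 'second' ]

// \b keeps "cat" from matching inside "concat".
const wholeWords = "cat concat".match(/\bcat\b/g);
console.log(wholeWords); // [ 'cat' ]
```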
Quantifiers
* zero or more (greedy)
+ one or more (greedy)
? zero or one (greedy)
{n} exactly n
{n,} n or more
{n,m} between n and m
*? zero or more (lazy)
+? one or more (lazy)
?? zero or one (lazy)
{n,m}? lazy version of bounded
Greedy matches as much as possible. Lazy matches as little as possible. They’re identical in what they can match, but very different in which match they prefer.
Input: "aaa"
Pattern: "a+" → matches "aaa"
Pattern: "a+?" → matches "a" (smallest still satisfying +)
Input: "<a><b>"
Pattern: "<.*>" → matches "<a><b>" (greedy — gobbles to last >)
Pattern: "<.*?>" → matches "<a>" (lazy — stops at first >)
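The same comparisons, run in JavaScript. This sketch also shows a negated class as a common alternative to a lazy quantifier.

```javascript
const greedyTag = "<a><b>".match(/<.*>/)[0];
console.log(greedyTag); // <a><b>

const lazyTag = "<a><b>".match(/<.*?>/)[0];
console.log(lazyTag); // <a>

const lazyPlus = "aaa".match(/a+?/)[0];
console.log(lazyPlus); // a

// Bounded quantifiers follow the same rule.
const bounded = "aaaaa".match(/a{2,3}/)[0];
const boundedLazy = "aaaaa".match(/a{2,3}?/)[0];
console.log(bounded, boundedLazy); // aaa aa

// [^>]* cannot run past a ">", so no laziness is needed.
const allTags = "<a><b>".match(/<[^>]*>/g);
console.log(allTags); // [ '<a>', '<b>' ]
```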
Groups
(abc) capturing group — referenced by $1, $2, ...
(?:abc) non-capturing group (no backreference slot)
(?<name>abc) named capturing group — referenced by $<name>
\1, \2 backreference to a numbered capture
\k<name> backreference to a named capture
Capturing groups are also “remembered” for replacement strings:
Pattern: (\w+)\s(\w+)
Replacement: $2 $1
Input: John Smith
Output: Smith John
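In JavaScript, the replacement above maps onto String.prototype.replace. The sketch below also shows named groups and a backreference; the group names first, last, year and month are chosen for illustration.

```javascript
// Numbered groups in the replacement string.
const swapped = "John Smith".replace(/(\w+)\s(\w+)/, "$2 $1");
console.log(swapped); // Smith John

// Named groups use $<name> in the replacement string.
const named = "John Smith".replace(
  /(?<first>\w+)\s(?<last>\w+)/,
  "$<last>, $<first>"
);
console.log(named); // Smith, John

// Captures are also available on the match object, by number and by name.
const dateMatch = "2024-05-17".match(/(?<year>\d{4})-(?<month>\d{2})/);
console.log(dateMatch.groups.year, dateMatch[2]); // 2024 05

// \1 inside the pattern must repeat whatever group 1 captured.
const hasDoubledWord = /\b(\w+) \1\b/.test("it is is fine");
console.log(hasDoubledWord); // true
```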
Alternation
cat|dog either "cat" or "dog"
^(yes|no)$ entire string is yes or no
Alternation has low precedence — it splits at the loosest level. Group
with (...) to scope it.
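A quick sketch of what the precedence rule means in practice. Without a group, the anchors attach to one alternative each.

```javascript
// Reads as (^yes) OR (no$): "starts with yes" or "ends with no".
const loose = /^yes|no$/;
console.log(loose.test("yesterday")); // true
console.log(loose.test("piano"));     // true

// The group scopes the alternation, so the anchors apply to both branches.
const scoped = /^(?:yes|no)$/;
console.log(scoped.test("yesterday")); // false
console.log(scoped.test("no"));        // true
```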
Lookaround
(?=...) positive lookahead — followed by ...
(?!...) negative lookahead — NOT followed by ...
(?<=...) positive lookbehind — preceded by ...
(?<!...) negative lookbehind — NOT preceded by ...
Lookarounds match a position, not characters — they don’t consume input.
\d+(?= dollars) digit run followed by " dollars" (the " dollars" isn't matched)
(?<=\$)\d+ digit run preceded by "$" (the "$" isn't matched)
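JavaScript supports all four, including variable-length lookbehind (since ES2018). Some other flavors, such as Python's re module and older PCRE versions, require the lookbehind pattern to have a fixed length.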
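All four lookarounds in one sketch, with made-up sample strings. In each case the matched text excludes whatever the lookaround inspected.

```javascript
// Positive lookahead: digits followed by " dollars".
const amount = "costs 25 dollars".match(/\d+(?= dollars)/)[0];
console.log(amount); // 25

// Positive lookbehind: digits preceded by "$".
const price = "price: $40".match(/(?<=\$)\d+/)[0];
console.log(price); // 40

// Negative lookahead: file names whose extension is not "ts".
const notTs = "foo.js foo.ts".match(/\b\w+\.(?!ts\b)\w+/g);
console.log(notTs); // [ 'foo.js' ]

// Negative lookbehind: digits not preceded by "a".
const notAfterA = "a1 b2".match(/(?<!a)\d/g);
console.log(notAfterA); // [ '2' ]
```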
Flags
g global — find all matches, not just the first
i case-insensitive
m multiline — ^ and $ match line boundaries
s dotall — . matches newlines
u unicode — full Unicode mode (enables \p{...}, validates surrogates)
y sticky — match at exact lastIndex
d hasIndices: adds an indices array of [start, end] offsets to match results (ES2022)
In JavaScript, you set them after the pattern: /pattern/gimsu. In
other languages, the API differs (Python: re.compile(pat, re.I | re.M),
Go: (?i)pattern, etc.).
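A sketch of several flags in JavaScript, using both the literal form and the RegExp constructor, where flags go in the second argument. Sample strings are arbitrary.

```javascript
// g: iterate over every match instead of stopping at the first.
const allNumbers = [..."a1b22c333".matchAll(/\d+/g)].map((m) => m[0]);
const firstOnly = "a1b22c333".match(/\d+/)[0];
console.log(allNumbers, firstOnly); // [ '1', '22', '333' ] 1

// i: flags passed as the second constructor argument.
const anyCase = new RegExp("hello", "i").test("HeLLo");
console.log(anyCase); // true

// s: the dot also matches a newline.
const dotAll = /a.b/s.test("a\nb");
console.log(dotAll); // true

// y: the match must start exactly at lastIndex.
const sticky = /\d+/y;
sticky.lastIndex = 1;
const atIndex1 = sticky.exec("a1b22"); // matches "1", lastIndex becomes 2
const atIndex2 = sticky.exec("a1b22"); // index 2 is "b", so no match
console.log(atIndex1[0], atIndex2); // 1 null
```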
Common patterns
\d{3}-\d{2}-\d{4} US Social Security number format
\b[A-Z]{2,3}\b 2-3 letter uppercase acronym
^\s*$ empty / whitespace-only line (with m flag)
\b\w+@\w+\.\w+\b simple email (NOT RFC-compliant)
^https?:// URL starts with http:// or https://
[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12} UUID (lowercase hex; add the i flag to accept uppercase)
^\d{4}-\d{2}-\d{2} ISO date prefix
\b1?\s*[-.]?\(?\d{3}\)?[-.]?\s*\d{3}[-.]?\d{4}\b US phone
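Three of these patterns, exercised in a short sketch. The UUID is anchored with ^ and $ and given the i flag here, which is an adjustment for validating a whole string rather than part of the pattern listed above. The sample values are made up.

```javascript
const uuid = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
console.log(uuid.test("123E4567-e89b-12d3-a456-426614174000")); // true
console.log(uuid.test("123e4567-e89b-12d3-a456"));              // false

// Checks the shape only: February 30th still passes.
const isoDatePrefix = /^\d{4}-\d{2}-\d{2}/;
console.log(isoDatePrefix.test("2024-02-30T10:00:00Z")); // true

// The simple email pattern has no dot in \w, so it misses dotted local parts.
const simpleEmail = /\b\w+@\w+\.\w+\b/;
const found = "first.last@example.com".match(simpleEmail)[0];
console.log(found); // last@example.com
```

The last result shows why the email pattern is labeled simple: it finds something address-shaped but not the whole address.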
Things JavaScript regex does NOT have
If you’re porting from PCRE, .NET, or Python, watch out:
| Feature | PCRE / .NET / Python | JavaScript |
|---|---|---|
| Atomic groups (?>...) | yes (Python 3.11+) | no |
| Possessive quantifiers a++ | PCRE, Python 3.11+ (not .NET) | no |
| Recursion (?R) | PCRE | no |
| Conditional (?(1)yes\|no) | yes | no |
| Unicode categories \p{Letter} | PCRE and .NET (as \p{L}); not Python's built-in re | yes (with u flag) |
| Named captures | yes | yes (since ES2018) |
| Variable-length lookbehind | .NET, newer PCRE | yes |
| \A, \z, \Z anchors | yes | no; use ^ and $ (without the m flag) instead |
| Inline modifiers (?i)... | yes | no; use flags |
When porting a PCRE pattern to JavaScript, the common adjustments are:
remove atomic groups (rewrite without nested quantifiers), add the u
flag for Unicode safety, and replace \A/\z with ^/$ while leaving the
m flag off.
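Two of those adjustments can be checked directly. A minimal sketch, assuming a PCRE source pattern of \Afoo\z:

```javascript
// PCRE \Afoo\z becomes ^foo$ with no m flag.
// In JavaScript, $ without m matches only at the very end of the input.
const exact = /^foo$/;
console.log(exact.test("foo"));   // true
console.log(exact.test("foo\n")); // false

// With m, the anchors match at line boundaries, which \A and \z never do.
console.log(/^foo$/m.test("bar\nfoo")); // true

// The u flag makes the dot consume a whole code point, not half a surrogate pair.
const emoji = "\u{1F600}";
console.log(/^.$/.test(emoji));  // false
console.log(/^.$/u.test(emoji)); // true
```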
Catastrophic backtracking
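Some patterns take exponential time to fail on fairly short input:
(a+)+b on a few dozen a's followed by "!" (no b): each extra a roughly doubles the work before the match fails
(a|aa)*b similar: overlapping alternatives inside a quantifier
.*.*b adjacent greedy quantifiers; quadratic rather than exponential, but still slow on long input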
The fix in JS (no atomic groups available):
- Avoid nested quantifiers when possible.
- Use specific character classes: \w+ instead of .*.
- Anchor your pattern.
- For complex grammars, parse with a real parser, not regex.
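As a sketch of the first bullet, the nested quantifier in (a+)+b can be flattened to a+b, which gives the same yes/no verdicts. The sample inputs are made up, and the risky pattern is deliberately run only on short strings.

```javascript
const risky = /(a+)+b/; // nested quantifier: exponential on failure
const safer = /a+b/;    // same verdicts, no nesting

// Both patterns agree on every sample.
const samples = ["aaab", "b", "aaa!", "xaab", ""];
const verdicts = samples.map((s) => [risky.test(s), safer.test(s)]);
console.log(verdicts.every(([r, s]) => r === s)); // true

// The flattened pattern still backtracks on a failing input, but the
// cost grows polynomially with length instead of doubling per character.
const hostile = "a".repeat(1000) + "!";
console.log(safer.test(hostile)); // false
```

Even the flattened version is not free on failure, so very long untrusted inputs are better handled by limiting input length or by a real parser.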