What is a regular expression?

On this page
  1. In one example
  2. What it’s good at
  3. What it’s bad at
  4. Where it came from
  5. Flavors
  6. A vocabulary primer
  7. Common gotchas
  8. When to reach for something else
  9. Tools on this site
  10. Related across the network

A regular expression (regex, regexp) is a pattern written in a small language for describing text. You write a pattern; a regex engine matches it against input strings. Regexes are used for searching, validating, extracting, and transforming text.

If you’ve ever used Find/Replace in VS Code, written a grep command, or validated a form field client-side, you’ve probably touched regex.

In one example

Pattern:  \b\d{3}-\d{2}-\d{4}\b
Input:    "Call me at 555-12-3456 about the report."
Match:    "555-12-3456" (a US SSN-format number)

The pattern is a tiny program: “find a word boundary, then 3 digits, then a dash, then 2 digits, then a dash, then 4 digits, then a word boundary.” The regex engine compiles that to a state machine and runs it against the input.
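To see that pattern run, here is a minimal sketch using Python's built-in re module. Python is just a convenient choice here, and the variable names are illustrative; other flavors should give the same match for this particular pattern.

```python
import re

# Raw string so the backslashes reach the regex engine unchanged.
SSN_FORMAT = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

text = "Call me at 555-12-3456 about the report."
match = SSN_FORMAT.search(text)

if match:
    print(match.group())  # 555-12-3456
    print(match.span())   # (11, 22): start and end offsets in the input
```

The word boundaries matter: without them, the pattern would also match inside a longer run of digits such as 5555-12-3456.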

What it’s good at

What it’s bad at

A useful heuristic: if the input has matched delimiters (parens, brackets, tags) at arbitrary depth, you need a parser, not regex.
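A rough illustration of that heuristic, in Python: a regex that handles one level of parentheses only ever sees the innermost pair, while a few lines of ordinary code can track depth. The is_balanced helper is hypothetical, written for this example, and stands in for a real parser.

```python
import re

# Matches "(" + any non-paren characters + ")": one level only.
FLAT_PARENS = re.compile(r"\([^()]*\)")


def is_balanced(text: str) -> bool:
    """Check parenthesis balance with a depth counter (illustrative helper)."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a close with no matching open
                return False
    return depth == 0


nested = "f(a, g(b, h(c)))"
print(FLAT_PARENS.findall(nested))  # ['(c)'], only the innermost group
print(is_balanced(nested))          # True
print(is_balanced("f(a, g(b)"))     # False
```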

Where it came from

Regex traces back to Stephen Kleene’s 1956 paper on regular sets in formal language theory. The implementations we use today started with Ken Thompson, who built regex matching into the QED editor and published the algorithm in 1968. The same approach later went into ed and grep, and became the foundation of regex support across Unix tools.

The “regex” you write in 2026 is several generations descended from that original. It includes features (backreferences, and recursion in some flavors) that go beyond what is formally “regular” in the language-theory sense, alongside conveniences like lookaround, but the name stuck.

Flavors

There’s no single regex spec. Each language and tool implements its own flavor, with variations in:

The major flavors:

Flavor      Used in                     Notable features
POSIX BRE   grep and sed without -E     Most basic; (, ), {, } are literal unless backslash-escaped
POSIX ERE   grep -E, sed -E, awk        Adds +, ?, |; parens and braces work unescaped
PCRE        grep -P, PHP, many editors  The “kitchen sink”: recursion, atomic groups, conditionals
JavaScript  browsers, Node              ECMAScript spec; close to PCRE but missing atomic groups and recursion
Python re   Python                      PCRE-ish; named groups, but no recursion
Java        JDK                         PCRE-style with explicit anchor variants (\A, \Z)
Go regexp   Go                          RE2-style engine: guaranteed linear time, no backreferences
.NET        C#, F#                      Variable-length lookbehind, balancing groups

The tester on this site uses JavaScript flavor. See Regex flavors compared for a detailed table.
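One quick way to feel the flavor differences is to ask a single engine which patterns it accepts. The sketch below probes Python's built-in re module only; the compiles helper is hypothetical, and the results say nothing about how other engines treat the same patterns.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's built-in re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(r"(?P<year>\d{4})"))      # True: named group
print(compiles(r"(?<=\$)\d+"))           # True: fixed-width lookbehind
print(compiles(r"(?<=\$\s*)\d+"))        # False: variable-length lookbehind
print(compiles(r"\((?:[^()]|(?R))*\)"))  # False: PCRE-style recursion
```

A pattern copied from a PCRE or .NET answer can fail at compile time like this, or worse, compile and mean something slightly different, so it is worth testing in the flavor you will actually run.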

A vocabulary primer

You’ll hear these terms:

Common gotchas

When to reach for something else

If you find yourself writing a regex with:

…consider whether a parser, a dedicated library, or a different approach is cleaner. Email validation in particular: the canonical “email regex” is several thousand characters and still doesn’t cover every legal address. A common approach in production code is to accept anything with an @ and confirm the address by sending mail to it.
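As a sketch of that loose approach, the check below needs no regex at all. The looks_like_email name is hypothetical, and requiring non-empty text on both sides of the @ is an assumption added for this example; real confirmation still means sending a message to the address.

```python
def looks_like_email(address: str) -> bool:
    """Loose plausibility check (illustrative helper), not RFC validation."""
    # Split on the LAST "@", since a quoted local part may itself contain one.
    local, sep, domain = address.rpartition("@")
    return bool(sep) and bool(local) and bool(domain)


print(looks_like_email("user@example.com"))  # True
print(looks_like_email("no-at-sign"))        # False
print(looks_like_email("@example.com"))      # False
```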

Tools on this site