Regular Expressions: The Complete Guide
Regular expressions (regex) are one of the most powerful tools in a developer’s arsenal. They let you search, match, validate, and transform text using patterns. Once you learn regex, you’ll use it everywhere — from form validation to log parsing to code refactoring.
What Is a Regular Expression?
A regex is a sequence of characters that defines a search pattern. Think of it as a highly advanced “find and replace” feature:
Pattern: \d{3}-\d{4}
Matches: 555-1234, 800-9999
The pattern above matches a 3-digit number, a hyphen, and a 4-digit number.
Practice along: Open our Regex Tester in another tab. Paste patterns from this guide and test them with your own text.
Basic Syntax
Literal Characters
Most characters match themselves. The pattern hello matches the text “hello” exactly.
Metacharacters
Special characters with meaning in regex:
| Metacharacter | Meaning | Example | Matches |
|---|---|---|---|
. | Any character (except newline) | h.t | hat, hit, hot |
\d | Any digit (0-9) | \d\d | 42, 99 |
\w | Word character (a-z, A-Z, 0-9, _) | \w+ | hello, user_1 |
\s | Whitespace (space, tab, newline) | \s+ | spaces, tabs |
\b | Word boundary | \bcat\b | ”cat” but not “category” |
^ | Start of string | ^Hello | ”Hello world” |
$ | End of string | world$ | ”Hello world” |
Quantifiers
Control how many times a pattern repeats:
| Quantifier | Meaning | Example | Matches |
|---|---|---|---|
* | Zero or more | ab*c | ac, abc, abbc |
+ | One or more | ab+c | abc, abbc (not ac) |
? | Zero or one | colou?r | color, colour |
{n} | Exactly n times | \d{4} | 2026 |
{n,m} | Between n and m times | \d{2,4} | 42, 123, 2026 |
Character Classes
Match any one character from a set:
[aeiou] — any vowel
[0-9] — any digit (same as \d)
[a-zA-Z] — any letter
[^0-9] — any non-digit (negation with ^)
Capture Groups
Parentheses () create groups that capture matched text:
Pattern: (\d{4})-(\d{2})-(\d{2})
Text: 2026-03-22
Group 1: 2026
Group 2: 03
Group 3: 22
Groups are numbered from left to right and let you extract specific parts of a match. This is powerful for parsing structured text like dates, logs, and URLs.
Lookahead and Lookbehind
Zero-width assertions that check context without consuming characters:
(?=...) Positive lookahead — "followed by"
(?!...) Negative lookahead — "NOT followed by"
(?<=...) Positive lookbehind — "preceded by"
(?<!...) Negative lookbehind — "NOT preceded by"
Example: Match a number only if followed by “px”:
\d+(?=px)
Matches "16" in "16px", but not "16" in "16em"
Common Patterns
Here are battle-tested patterns for everyday tasks:
- Email Validation — Matching and validating email addresses.
- Phone Numbers — International phone format patterns.
- Regex Cheat Sheet — Quick reference for all regex syntax.
Performance Considerations
Catastrophic Backtracking
Nested quantifiers like (a+)+ can cause exponential time complexity:
Pattern: (a+)+b
Text: aaaaaaaaaaaaaaaaac
Result: Browser freeze! (each 'a' doubles processing time)
Solution: Avoid nested quantifiers. Rewrite as a+b instead.
Use Anchors
Always anchor your patterns when possible. ^\d{5}$ is much faster than \d{5} on large texts because the engine doesn’t need to check every position.
Flag Reference
| Flag | Name | Effect |
|---|---|---|
g | Global | Find all matches, not just the first |
i | Case-insensitive | A matches a |
m | Multiline | ^ and $ match line boundaries |
s | DotAll | . matches newlines |
u | Unicode | Enable Unicode property escapes |
Test all of these: Our Regex Tester supports all flags and shows matched groups in real time.