Email Validation Regex: Patterns That Actually Work
Email validation is one of the most common regex use cases — and one of the most frequently done wrong. Developers either use patterns that are too strict (rejecting valid addresses like user+tag@example.com) or too permissive (accepting clearly invalid strings). The root cause is that email address syntax, as defined by RFC 5322, is far more complex than most people realize — it allows quoted strings, comments, IP address literals, and even whitespace within the local part.
This guide provides practical, production-ready patterns you can use immediately, explains the trade-offs between strictness and usability, and covers the edge cases that break most validation implementations. Every pattern has been tested against real-world email datasets including international domains, plus-addressed emails, and subdomains.
Test it now: Paste any pattern from this guide into our Regex Tester and try various email addresses in real time.
The Simple Pattern (Recommended for Most Applications)
For most applications, this pattern provides the best balance of accuracy and simplicity:
^[^\s@]+@[^\s@]+\.[^\s@]+$
This matches: one or more non-whitespace, non-@ characters, an @ symbol, one or more non-whitespace, non-@ characters, a dot, and one or more non-whitespace, non-@ characters.
Matches: user@example.com, user+tag@sub.domain.com, 名前@example.jp
Rejects: @example.com, user@, user @example.com
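The examples above can be re-checked with a few lines of Python. This is a minimal sketch: the helper name `matches_simple` is illustrative rather than taken from any library, and `re.fullmatch` is used so that the whole string has to match.

```python
import re

# The simple structural pattern from this section.
SIMPLE = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")

def matches_simple(address: str) -> bool:
    """Return True when the whole string looks like something@something.something."""
    return SIMPLE.fullmatch(address) is not None

samples = [
    "user@example.com", "user+tag@sub.domain.com", "名前@example.jp",  # expected to match
    "@example.com", "user@", "user @example.com",                      # expected to fail
]
for address in samples:
    print(f"{address!r}: {matches_simple(address)}")
```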
Why this pattern works
The philosophy behind this pattern is deliberate: it validates structure (something@something.something) without making assumptions about which characters are allowed. The HTML specification takes a similarly pragmatic line for <input type="email">: it deliberately uses a simplified pattern instead of the full RFC 5322 syntax. Accept anything that looks structurally like an email, then verify deliverability separately.
The Strict Pattern
For stricter validation (standard ASCII emails only):
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
This pattern is used by many web frameworks and libraries. It requires:
- Local part: letters, digits, dots, underscores, percent, plus, hyphen
- Domain: letters, digits, dots, hyphens
- TLD: at least 2 letters
Comparison: Simple vs Strict
| Email Address | Simple Pattern | Strict Pattern | Actually Valid? |
|---|---|---|---|
| user@example.com | ✅ | ✅ | ✅ |
| user+tag@sub.domain.com | ✅ | ✅ | ✅ |
| 名前@example.jp | ✅ | ❌ | ✅ (RFC 6531) |
| user@123.123.123.123 | ✅ | ❌ | ⚠️ Parses as a domain; an IP literal needs brackets: user@[123.123.123.123] |
| "quoted string"@example.com | ❌ (contains whitespace) | ❌ | ✅ (RFC 5322) |
| user@.example.com | ✅ | ✅ | ❌ (leading dot) |
| user@example | ❌ (no dot in domain) | ❌ | ⚠️ Valid in private networks |
Recommendation: Use the simple pattern for public-facing forms (don’t frustrate users) and the strict pattern for internal systems where you control the email format.
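Any row of the comparison is easy to re-check by running both patterns side by side. The sketch below assumes Python 3.9 or later; the names `SIMPLE`, `STRICT`, and `compare` are illustrative.

```python
import re

SIMPLE = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")
STRICT = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

ADDRESSES = [
    "user@example.com",
    "user+tag@sub.domain.com",
    "名前@example.jp",
    "user@123.123.123.123",
    '"quoted string"@example.com',
    "user@.example.com",
    "user@example",
]

def compare(address: str) -> tuple[bool, bool]:
    """Return (simple_result, strict_result) for one address."""
    return (SIMPLE.fullmatch(address) is not None,
            STRICT.fullmatch(address) is not None)

for address in ADDRESSES:
    simple_ok, strict_ok = compare(address)
    print(f"{address:32} simple={simple_ok!s:5} strict={strict_ok}")
```

Neither pattern says anything about deliverability; the output only shows which strings each one lets through.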
The HTML5 Browser Pattern
The W3C HTML5 spec defines this regex for <input type="email">:
^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$
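This is more permissive than the strict pattern in the local part (it allows more special characters) but more restrictive than the simple pattern: it is ASCII-only, and every domain label must start and end with a letter or digit. Unlike the other two patterns, it does not require a dot in the domain, so user@localhost passes. Browsers apply this validation by default when you use <input type="email">.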
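To see how this pattern behaves outside a browser, it can be compiled in Python as well. A minimal sketch, with the regex copied from above and split across lines only for readability; `matches_html5` is a hypothetical helper name.

```python
import re

# The pattern for <input type="email">, copied from this section.
HTML5 = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)

def matches_html5(address: str) -> bool:
    """Return True when the whole string satisfies the HTML5 email pattern."""
    return HTML5.fullmatch(address) is not None

for address in ["user@example.com", "user@example", "user@.example.com", "名前@example.jp"]:
    print(f"{address!r}: {matches_html5(address)}")
```

If you need a dot in the domain, that has to be checked separately, because this pattern allows a single-label domain.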
Patterns to Avoid
The “Perfect” RFC 5322 Pattern
The full RFC 5322 email specification is incredibly complex. The “complete” regex is over 6,000 characters long and matches edge cases like:
"quoted string"@example.com
user@[192.168.1.1]
(comment)user@example.com
Don’t use this in production. It is unmaintainable, impossible to debug, and matches edge-case addresses that most mail servers reject in practice. The effort-to-value ratio is terrible.
Overly Restrictive Patterns
^[a-z]+@[a-z]+\.[a-z]{3}$
This rejects valid addresses like:
- user.name@example.com (dots in local part)
- USER@example.com (uppercase letters)
- user+tag@example.com (plus addressing: Gmail, Fastmail)
- user@example.co.uk (multi-part TLDs)
- user@example.io (2-letter TLDs)
- user123@example.com (digits in local part)
Warning: Plus addressing (user+tag@domain.com) is used by millions of Gmail, Outlook, and Fastmail users for filtering. Rejecting + in the local part will block real customers.
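A short Python sketch makes the cost of the restrictive pattern concrete: each of the valid addresses listed above fails it, while the simple structural pattern accepts all of them. The variable names are illustrative.

```python
import re

RESTRICTIVE = re.compile(r"[a-z]+@[a-z]+\.[a-z]{3}")
SIMPLE = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")

VALID_ADDRESSES = [
    "user.name@example.com",
    "USER@example.com",
    "user+tag@example.com",
    "user@example.co.uk",
    "user@example.io",
    "user123@example.com",
]

# Addresses the restrictive pattern turns away, and those the simple one lets through.
rejected = [a for a in VALID_ADDRESSES if not RESTRICTIVE.fullmatch(a)]
accepted_by_simple = [a for a in VALID_ADDRESSES if SIMPLE.fullmatch(a)]

print(f"restrictive pattern rejects {len(rejected)} of {len(VALID_ADDRESSES)}")
print(f"simple pattern accepts {len(accepted_by_simple)} of {len(VALID_ADDRESSES)}")
```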
TLD Length Restrictions
\.[a-zA-Z]{2,4}$
This rejects valid TLDs like .museum (6 chars), .technology (10 chars), and .photography (11 chars). Since 2014, ICANN has approved hundreds of new TLDs with varying lengths. Never restrict TLD length to 4 characters.
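The difference between a capped and an open-ended TLD quantifier can be checked directly. A small sketch in Python, with illustrative names:

```python
import re

CAPPED_TLD = re.compile(r"\.[a-zA-Z]{2,4}$")  # the restriction to avoid
OPEN_TLD = re.compile(r"\.[a-zA-Z]{2,}$")     # no upper bound on TLD length

for domain in ["example.com", "example.museum", "example.technology", "example.photography"]:
    capped = CAPPED_TLD.search(domain) is not None
    open_ended = OPEN_TLD.search(domain) is not None
    print(f"{domain:22} capped={capped!s:5} open={open_ended}")
```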
Language-Specific Implementations
JavaScript
// Simple validation
function isValidEmail(email) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// Strict validation with additional checks
function validateEmail(email) {
  const trimmed = email.trim().toLowerCase();
  if (trimmed.length > 254) {
    return { valid: false, error: 'Email exceeds maximum length (254 chars)' };
  }
  // Split on the last @ so a missing part is reported as missing, not as too long
  const at = trimmed.lastIndexOf('@');
  const local = at === -1 ? '' : trimmed.slice(0, at);
  const domain = at === -1 ? '' : trimmed.slice(at + 1);
  if (!local || !domain) {
    return { valid: false, error: 'Email must contain a local part, an @, and a domain' };
  }
  if (local.length > 64) {
    return { valid: false, error: 'Local part exceeds maximum length (64 chars)' };
  }
  if (domain.length > 253) {
    return { valid: false, error: 'Domain exceeds maximum length (253 chars)' };
  }
  const pattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
  if (!pattern.test(trimmed)) {
    return { valid: false, error: 'Invalid email format' };
  }
  return { valid: true, normalized: trimmed };
}
Python
import re

def is_valid_email(email: str) -> bool:
    """Simple email format validation."""
    pattern = r'^[^\s@]+@[^\s@]+\.[^\s@]+$'
    # fullmatch, not match: with re.match, "$" also accepts a trailing newline
    return re.fullmatch(pattern, email) is not None

def validate_email_strict(email: str) -> dict:
    """Strict validation with length checks per RFC 5321."""
    email = email.strip().lower()
    if len(email) > 254:
        return {'valid': False, 'error': 'Exceeds 254 character limit'}
    if '@' not in email:
        return {'valid': False, 'error': 'Missing @ symbol'}
    # Split on the last @ so the local part is measured correctly
    local, domain = email.rsplit('@', 1)
    if len(local) > 64:
        return {'valid': False, 'error': 'Local part exceeds 64 characters'}
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    if not re.fullmatch(pattern, email):
        return {'valid': False, 'error': 'Invalid format'}
    return {'valid': True, 'normalized': email}
Go
import (
	"net/mail"
	"regexp"
	"strings"
)

var emailRegex = regexp.MustCompile(`^[^\s@]+@[^\s@]+\.[^\s@]+$`)

func IsValidEmail(email string) bool {
	email = strings.TrimSpace(email)
	if len(email) > 254 {
		return false
	}
	// Use Go's built-in RFC 5322 parser
	_, err := mail.ParseAddress(email)
	return err == nil && emailRegex.MatchString(email)
}
HTML5 (No Regex Needed)
<input type="email" required />
The type="email" attribute provides built-in browser validation using the W3C pattern. This is the best approach for forms because it provides native UI feedback (invalid state styling, validation messages) without JavaScript.
International Email Addresses (EAI / RFC 6531)
Since 2012, the Email Address Internationalization (EAI) standards allow non-ASCII characters in both the local part and the domain:
- Local part: 用户@example.com (Chinese characters)
- Domain: user@例え.jp (internationalized domain name)
- Both: 用户@例え.jp
These addresses are valid per RFC 6531 and increasingly common in Asia. The simple pattern (^[^\s@]+@[^\s@]+\.[^\s@]+$) handles them correctly, but the strict ASCII pattern rejects them. If your application serves international users, use the simple pattern or explicitly support Unicode in your regex.
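One way to "explicitly support Unicode" is sketched below in Python, where `\w` matches letters and digits from any script by default. The pattern `UNICODE_STRICT` is a hypothetical adaptation of the strict pattern, not a standard one, and it is slightly looser than the ASCII version because `\w` also admits underscores in the domain.

```python
import re

ASCII_STRICT = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Hypothetical Unicode-aware variant: \w covers letters, digits, and underscore
# in any script; [^\W\d_] narrows that to letters only for the TLD.
UNICODE_STRICT = re.compile(r"[\w.%+-]+@[\w.-]+\.[^\W\d_]{2,}")

for address in ["用户@example.com", "user@例え.jp", "用户@例え.jp", "user@example.com"]:
    ascii_ok = ASCII_STRICT.fullmatch(address) is not None
    unicode_ok = UNICODE_STRICT.fullmatch(address) is not None
    print(f"{address!r}: ascii={ascii_ok} unicode={unicode_ok}")
```

Whether your mail infrastructure can actually deliver to such addresses is a separate question that the regex cannot answer.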
Length Limits You Should Enforce
RFC 5321 and RFC 1035 define maximum lengths that are awkward to express in a regex (regex is bad at counting). Add these as separate checks:
| Component | Max Length | Standard |
|---|---|---|
| Total email | 254 characters | RFC 5321 §4.5.3.1.3 |
| Local part (before @) | 64 characters | RFC 5321 §4.5.3.1.1 |
| Domain (after @) | 253 characters | RFC 1035 §2.3.4 |
| Domain label (between dots) | 63 characters | RFC 1035 §2.3.4 |
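The limits in the table can be enforced with plain length checks. The following Python sketch assumes Python 3.9 or later and uses hypothetical helper names (`octets`, `length_problems`). It counts UTF-8 bytes on the assumption that the limits are measured in octets; for internationalized domains, treat that as an approximation.

```python
def octets(text: str) -> int:
    """Length in UTF-8 bytes (assumption: the limits are counted in octets)."""
    return len(text.encode("utf-8"))

def length_problems(address: str) -> list[str]:
    """Return every length limit the address violates; empty means it passes."""
    problems = []
    if octets(address) > 254:
        problems.append("address longer than 254 octets")
    # Split on the last @ so the local part and domain are measured separately.
    local, at, domain = address.rpartition("@")
    if not at:
        problems.append("missing @")
        return problems
    if octets(local) > 64:
        problems.append("local part longer than 64 octets")
    if octets(domain) > 253:
        problems.append("domain longer than 253 octets")
    if any(octets(label) > 63 for label in domain.split(".")):
        problems.append("domain label longer than 63 octets")
    return problems

print(length_problems("user@example.com"))
print(length_problems("a" * 65 + "@example.com"))
```

Returning a list instead of a boolean lets the caller show the user exactly which limit was exceeded.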
Common Mistakes
1. Not trimming whitespace
Users frequently paste emails with leading/trailing spaces from documents. Always .trim() before validation.
2. Case-sensitive comparison
Email local parts are technically case-sensitive per RFC 5321, but in practice no major provider treats them as such. Always normalize to lowercase before storage and comparison.
3. Rejecting plus addressing
user+newsletter@gmail.com is a valid address that delivers to user@gmail.com. Many users rely on this for email filtering. Never reject + in the local part.
4. Blocking new TLDs
.app, .dev, .shop, .cloud — hundreds of valid TLDs exist beyond .com and .org. Never hardcode a TLD allowlist.
5. Using regex as the only validation
Regex confirms format, not existence. perfectly.valid.format@this-domain-does-not-exist.com passes every regex pattern. Always send a verification email for critical flows (registration, password reset).
The Golden Rule
Regex validates format; confirmation validates existence. The only way to truly verify an email address is to send a confirmation email with a verification link. Use regex as a first-pass filter to catch obvious typos, then verify with email delivery. This two-step approach catches both format errors (regex) and deliverability issues (confirmation).
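The confirmation step can be sketched as a one-time token that is embedded in the verification link. Everything here is illustrative: the function names, the 24-hour expiry, and the in-memory dictionary standing in for a database are assumptions, and sending the email itself is left out.

```python
import secrets
import time
from typing import Optional

TOKEN_TTL_SECONDS = 24 * 60 * 60  # assumed expiry window

# In-memory stand-in for a database table: token -> (email, expiry timestamp).
_pending: dict[str, tuple[str, float]] = {}

def start_verification(email: str, now: Optional[float] = None) -> str:
    """Create a one-time token to embed in the confirmation link."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _pending[token] = (email, now + TOKEN_TTL_SECONDS)
    return token

def confirm(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the verified email for a valid, unexpired token, else None."""
    now = time.time() if now is None else now
    entry = _pending.pop(token, None)  # pop makes every token single-use
    if entry is None:
        return None
    email, expires_at = entry
    return email if now <= expires_at else None
```

A typical flow would run the regex check first, call `start_verification` only for addresses that pass, and mark the address as verified once `confirm` returns it.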
Frequently Asked Questions
Can regex fully validate an email address according to RFC 5322?
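Nearly, but not in any practical sense. The often-quoted “complete” regex is over 6,000 characters long and unmaintainable, and RFC 5322 comments can nest, which a classic regular expression cannot match without recursion extensions. In practice, a simple structural check combined with a verification email is more reliable and orders of magnitude easier to maintain.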
Why does my regex reject user+tag@gmail.com?
Your pattern likely uses [a-zA-Z0-9._-] for the local part, which excludes +. Add + to the character class: [a-zA-Z0-9._%+-]. Plus addressing is a widely supported convention, available in Gmail, Outlook, Fastmail, and most modern email providers.
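The effect of that one-character change is easy to confirm. In this Python sketch the domain half of both patterns is an assumption borrowed from the strict pattern; only the local-part character class differs.

```python
import re

WITHOUT_PLUS = re.compile(r"[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
WITH_PLUS = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

address = "user+tag@gmail.com"
print("without +:", WITHOUT_PLUS.fullmatch(address) is not None)
print("with +:   ", WITH_PLUS.fullmatch(address) is not None)
```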
Should I validate email on the frontend or backend?
Both. Frontend validation (HTML5 type="email" or JavaScript regex) provides instant feedback. Backend validation (regex + length checks + verification email) ensures security and data integrity. Never trust frontend validation alone — it can be bypassed.
This article is part of our Regular Expressions Guide series.