TL;DR
A regex tester lets you write, debug, and validate regular expressions against sample text in real time. Use our free online regex tester to get instant match highlighting, capture group extraction, and substitution previews for JavaScript, Python, and Go. This guide covers the full regex syntax (character classes, quantifiers, anchors, named groups, lookahead/lookbehind), plus language-specific APIs, multiline mode, performance pitfalls, and 8 copy-paste patterns for email, URL, IP, phone, date, hex color, passwords, and slugs.
Regular expressions are one of the most powerful text-processing tools available to developers, yet they are notoriously easy to get wrong. A single missing escape character or a misplaced quantifier can silently match the wrong data or cause a server to hang. An online regex tester eliminates the guesswork by giving you instant visual feedback before you ever deploy code.
This guide is organized into twelve sections. Whether you need a quick syntax reference, a production-ready email pattern, or an explanation of catastrophic backtracking, you can jump directly to the section you need.
1. Regex Syntax Quick Reference
The tables below summarise the most commonly used regex building blocks. Unless a row notes otherwise, the syntax works in JavaScript, Python, and most PCRE-style engines.
Character Classes
| Pattern | Matches | Example |
|---|---|---|
| . | Any character except newline | /./ matches "a", "9", "!" |
| \d | Digit [0-9] | /\d+/ matches "42" in "foo42bar" |
| \D | Non-digit | /\D+/ matches "foo" in "foo42bar" |
| \w | Word char [a-zA-Z0-9_] | /\w+/ matches "hello_42" |
| \W | Non-word char | /\W/ matches " " and "-" |
| \s | Whitespace (space, tab, newline) | /\s+/ matches " \t" |
| \S | Non-whitespace | /\S+/ matches "hello" |
| [a-z] | Any lowercase letter | /[a-z]+/ matches "abc" |
| [A-Z] | Any uppercase letter | /[A-Z]+/ matches "ABC" |
| [0-9] | Any digit (same as \d) | /[0-9]+/ matches "123" |
| [abc] | Any of a, b, or c | /[abc]/ matches "b" in "bat" |
| [^abc] | Any char NOT a, b, or c | /[^abc]+/ matches "xyz" |
| [a-zA-Z] | Any letter | /[a-zA-Z]+/ matches "Hello" |
Quantifiers
| Quantifier | Meaning | Lazy version |
|---|---|---|
| * | Zero or more (greedy) | *? |
| + | One or more (greedy) | +? |
| ? | Zero or one | ?? |
| {n} | Exactly n times | {n}? (no effect) |
| {n,} | n or more | {n,}? |
| {n,m} | Between n and m | {n,m}? |
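To make the greedy/lazy distinction concrete, here is a minimal sketch that runs both forms against the same input. The sample strings and variable names are illustrative only.

```javascript
// Greedy vs lazy quantifiers on the same input (sample string is made up).
const sample = 'key="a" other="b"';

// Greedy: .* grabs as much as possible, then backtracks to the LAST quote.
const greedy = sample.match(/".*"/)[0];

// Lazy: .*? grabs as little as possible, stopping at the FIRST closing quote.
const lazy = sample.match(/".*?"/)[0];

// {n,m} bounds the repetition; the lazy form prefers the lower bound.
const bounded = 'aaaaa'.match(/a{2,4}/)[0];
const boundedLazy = 'aaaaa'.match(/a{2,4}?/)[0];

console.log(greedy);      // "a" other="b"
console.log(lazy);        // "a"
console.log(bounded);     // aaaa
console.log(boundedLazy); // aa
```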
Anchors
| Anchor | Matches |
|---|---|
| ^ | Start of string (or line in multiline mode) |
| $ | End of string (or line in multiline mode) |
| \b | Word boundary (between \w and \W) |
| \B | Non-word boundary |
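| \A | Start of entire string, unaffected by multiline mode (Python, Java, PCRE, Go; not JavaScript) |
| \Z | End of entire string (Python); in Java and PCRE it also matches before a final newline; not JavaScript |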
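Anchors match positions rather than characters, which is easiest to see side by side. The following is a small sketch using a made-up input string; it sticks to the anchors JavaScript supports (^, $, \b, \B).

```javascript
// Illustrative input: the same three letters appear alone, inside a word, and as a prefix.
const text = 'cat concat cats';

// \b on both sides restricts the match to the standalone word.
const wholeWord = text.match(/\bcat\b/g);

// Without anchors, every occurrence matches.
const anywhere = text.match(/cat/g);

// \B requires a NON-boundary, so only the "cat" inside "concat" qualifies.
const inside = text.match(/\Bcat/g);

// ^ and $ together force the whole string to match.
const isOnlyDigits = (s) => /^\d+$/.test(s);

console.log(wholeWord);            // ['cat']
console.log(anywhere.length);      // 3
console.log(inside);               // ['cat']
console.log(isOnlyDigits('123'));  // true
console.log(isOnlyDigits('12a'));  // false
```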
Groups and Alternation
| Syntax | Meaning |
|---|---|
| (abc) | Capturing group: captures "abc" as group 1 |
| (?:abc) | Non-capturing group: groups without capturing |
| (?<name>abc) | Named capturing group (JavaScript, Java, .NET, PCRE): in JavaScript, read it via match.groups.name. Python's re module does not accept this form |
| (?P<name>abc) | Named capturing group (Python syntax; also accepted by Go, Rust, and PCRE) |
| (?=abc) | Positive lookahead: must be followed by "abc" |
| (?!abc) | Negative lookahead: must NOT be followed by "abc" |
| (?<=abc) | Positive lookbehind: must be preceded by "abc" |
| (?<!abc) | Negative lookbehind: must NOT be preceded by "abc" |
| a\|b | Alternation: matches "a" or "b" |
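The difference between capturing and non-capturing groups shows up in the match array, and alternation has a scoping pitfall worth seeing once. This is a minimal sketch; the file name and patterns are invented for illustration.

```javascript
// Illustrative input; the patterns are hypothetical.
const input = 'img_0042.png';

// Capturing groups are numbered left to right by their opening parenthesis.
const capturing = input.match(/^(img|doc)_(\d+)\.(png|jpg)$/);

// Non-capturing groups still scope the alternation but take no group number.
const nonCapturing = input.match(/^(?:img|doc)_(\d+)\.(?:png|jpg)$/);

// Without a group, | splits the WHOLE pattern: this means (^img) OR (doc$).
const ungrouped = /^img|doc$/.test('img_anything');

console.log(capturing.slice(1));    // ['img', '0042', 'png']
console.log(nonCapturing.slice(1)); // ['0042']
console.log(ungrouped);             // true
```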
Special Characters
\. Literal dot (escape . to match it literally)
\n Newline character
\r Carriage return
\t Tab character
\0 Null character
\\ Literal backslash
\( Literal opening parenthesis (escape special chars)
\[ Literal opening bracket
2. Common Regex Patterns (Copy-Paste Ready)
The patterns below are production-tested and cover the most common validation and extraction tasks. Test them instantly in our free regex tester.
Email Address
/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/
// Matches: user@example.com, first.last+tag@sub.domain.io
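// Misses intentionally: IP-address emails, quoted local parts
URL
/(https?:\/\/)?([\/\da-z.-]+)\.([a-z.]{2,6})([/\w .-]*)*\/?/i
// Matches: http://example.com, https://sub.domain.co.uk/path
// Unanchored, so it matches only up to the path; a query string such as ?q=1 is not included
// Do not simply append $: the nested quantifier ([/\w .-]*)* can then backtrack catastrophically
// Use URL constructor for strict validation in production
IPv4 Address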
/^(\d{1,3}\.){3}\d{1,3}$/
// Fast syntax check only: does not validate range (0-255)
// For strict validation, match each octet with (?:25[0-5]|2[0-4]\d|[01]?\d\d?)
const ipStrict = /^(?:25[0-5]|2[0-4]\d|[01]?\d\d?)(?:\.(?:25[0-5]|2[0-4]\d|[01]?\d\d?)){3}$/;
US Phone Number
/^\+?1?\s?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/
// Matches: (555) 123-4567, +1 555.123.4567, 5551234567
// Does not match international formats outside North America
ISO 8601 Date (YYYY-MM-DD)
/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/
// Matches: 2026-02-27, 2000-12-31
// Does not validate month/day combinations (Feb 30 would pass syntax check)
Hex Color
/^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/
// Matches: #fff, #FFF, #1a2b3c
// To also accept 4- and 8-digit forms, use {3,4}|{6}|{8} (a {6,8} range would wrongly allow 7 digits)
Strong Password
/^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/
// Requires: 8+ chars, at least one lowercase, one uppercase,
// one digit, and one special character (@$!%*?&)
// Uses four lookaheads, each checked independently
URL Slug
/^[a-z0-9]+(?:-[a-z0-9]+)*$/
// Matches: hello-world, my-blog-post-2026
// Rejects: -leading-dash, double--dash, UPPERCASE
3. JavaScript Regex API
JavaScript has first-class regex support built into the language. Patterns can be written as literals or constructed dynamically.
Literal vs Constructor
// Literal: compiled once at parse time, preferred for static patterns
const re = /\d+/g;
// Constructor: use when the pattern is dynamic (user input, config)
const pattern = '\\d+';
const re2 = new RegExp(pattern, 'g'); // note: double-escape in strings
Core Methods
const str = 'Order: 42 items, total: 189';
// .test(): returns boolean, fastest for existence check
/\d+/.test(str); // true
// .exec(): returns first match array with groups, or null
/\d+/g.exec(str); // ['42', index: 7, ...]
// str.match(): without g: first match + groups; with g: all matches (no groups)
str.match(/\d+/); // ['42', index: 7, ...]
str.match(/\d+/g); // ['42', '189']
// str.matchAll(): iterator of all matches with groups (requires g flag)
const matches = [...str.matchAll(/\d+/g)];
// [['42', index:7], ['189', index:24]]
// str.replace(): replace first match (no g) or all (with g)
str.replace(/\d+/, 'N'); // 'Order: N items, total: 189'
str.replace(/\d+/g, 'N'); // 'Order: N items, total: N'
// str.replaceAll(): replaces all occurrences; pattern must have g flag if regex
str.replaceAll(/\d+/g, 'N'); // 'Order: N items, total: N'
// str.split(): split on regex delimiter
'a1b2c3'.split(/\d/); // ['a', 'b', 'c', '']
Flags Reference
| Flag | Name | Effect |
|---|---|---|
| g | global | Find all matches, not just the first |
| i | ignoreCase | Case-insensitive matching |
| m | multiline | ^ and $ match line start/end |
| s | dotAll | . matches newlines too |
| u | unicode | Enable full Unicode support |
| d | hasIndices | Add .indices to match results (ES2022) |
| v | unicodeSets | Enhanced Unicode sets (ES2024) |
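The effect of each flag is easiest to verify by toggling it on a fixed input. Below is a small sketch under the assumption of a runtime that supports the d flag (for example Node 16+ or a current browser); the input string and variable names are illustrative.

```javascript
const twoLines = 'Total: 42\ntotal: 7';

// i + g: case-insensitive, all matches.
const labels = twoLines.match(/total/gi);

// m: ^ also matches after each newline; without it, only at index 0.
const lineStarts = twoLines.match(/^total/gim);
const stringStart = twoLines.match(/^total/gi);

// s: lets . match the newline between "42" and "total".
const withoutS = /42.total/.test(twoLines);
const withS = /42.total/s.test(twoLines);

// d: adds [start, end] offsets for the match and each group.
const digits = /\d+/d.exec(twoLines);
const span = digits.indices[0];

// u: treats a code point outside the BMP as ONE character for the dot.
const grin = '\u{1F600}';
const withoutU = /^.$/.test(grin);
const withU = /^.$/u.test(grin);

console.log(labels);              // ['Total', 'total']
console.log(lineStarts);          // ['Total', 'total']
console.log(stringStart);         // ['Total']
console.log(withoutS, withS);     // false true
console.log(span);                // [7, 9]
console.log(withoutU, withU);     // false true
```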
4. Named Capture Groups
Named capture groups were introduced in ES2018 for JavaScript and have long been available in Python. They make patterns self-documenting and protect against index-shifting when you add or remove groups later.
// Pattern with three named groups
const dateRe = /(?<year>\d{4})-(?<month>0[1-9]|1[0-2])-(?<day>0[1-9]|[12]\d|3[01])/;
const m = '2026-02-27'.match(dateRe);
if (m) {
  const { year, month, day } = m.groups;
  console.log(year, month, day); // '2026', '02', '27'
}
// Using named groups in replace() (reference them with $<name>)
const reformat = (iso) =>
  iso.replace(
    /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
    '$<day>/$<month>/$<year>' // rearrange to DD/MM/YYYY
  );
console.log(reformat('2026-02-27')); // '27/02/2026'
// Using a function in replace for complex transformations
const result = '2026-02-27'.replace(
  /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
  (_, year, month, day, offset, input, groups) =>
    `${groups.day} ${groups.month} ${groups.year}`
);
console.log(result); // '27 02 2026'
Tip: Named groups are also available in matchAll(): each match it yields exposes a groups property.
const text = 'Dates: 2026-01-01 and 2026-06-15';
for (const match of text.matchAll(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/g)) {
  console.log(match.groups); // { year: '2026', month: '01', day: '01' } ...
}
5. Lookahead and Lookbehind
Lookahead and lookbehind are zero-width assertions: they check surrounding context without consuming characters. This makes them ideal for conditional matching.
Positive Lookahead: (?=...)
// Match "foo" only when followed by "bar"
/foo(?=bar)/.test('foobar'); // true
/foo(?=bar)/.test('foobaz'); // false
// Password: must contain at least one digit (lookahead doesn't consume)
/^(?=.*\d).{8,}$/.test('password1'); // true
/^(?=.*\d).{8,}$/.test('password'); // false
Negative Lookahead: (?!...)
// Match "foo" NOT followed by "bar"
/foo(?!bar)/.test('foobaz'); // true
/foo(?!bar)/.test('foobar'); // false
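// Match any letters-only word that is not followed by a space and a digit
// (the lookahead goes AFTER \b; placed before it, (?!\d) could never fail)
/\b[a-z]+\b(?! \d)/gi
Positive Lookbehind: (?<=...)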
// Match digits preceded by "$"
const prices = 'Price: $42 and $189';
const nums = prices.match(/(?<=\$)\d+/g);
console.log(nums); // ['42', '189']
// Extract value after "key:" in a config string
const val = 'port: 3000'.match(/(?<=port: )\d+/)?.[0];
console.log(val); // '3000'
Negative Lookbehind: (?<!...)
// Match digit runs NOT preceded by "$" (the \d in the lookbehind stops a match from starting mid-number, e.g. at the "2" of "$42")
'Price: $42 count: 7'.match(/(?<![$\d])\d+/g); // ['7']
// Caution: a lookbehind placed right after ^ always succeeds, because nothing precedes position 0.
// To reject strings that START with a digit, use a negative lookahead instead:
/^(?!\d)(?=.*[A-Z])(?=.*\d).{8,}$/.test('Pass1word'); // true
/^(?!\d)(?=.*[A-Z])(?=.*\d).{8,}$/.test('1Password'); // false
Browser support note: Lookbehind is supported in all modern browsers (Chrome 62+, Firefox 78+, Safari 16.4+). Go's RE2 engine does NOT support any form of lookahead or lookbehind.
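For code that may still run on engines without lookbehind, one possible approach is to feature-detect at run time and fall back to a capture group. The sketch below assumes nothing beyond standard JavaScript; the function names supportsLookbehind and extractPrices are hypothetical.

```javascript
// Returns true when the current engine can compile a lookbehind assertion.
// Building the pattern with new RegExp defers the syntax error to run time;
// a regex literal would instead fail when the whole script is parsed.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch {
    return false;
  }
}

// Extracts dollar amounts, using lookbehind when available and a
// capture group (which works on every engine) otherwise.
function extractPrices(text) {
  if (supportsLookbehind()) {
    return text.match(new RegExp('(?<=\\$)\\d+', 'g')) ?? [];
  }
  return [...text.matchAll(/\$(\d+)/g)].map((m) => m[1]);
}

console.log(extractPrices('Price: $42 and $189')); // ['42', '189']
```

Both branches return the same result, so callers do not need to know which one ran.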
6. Python Regex with the re Module
Python's built-in re module provides a complete regex API. The main differences from JavaScript are that flags are passed as constants such as re.I instead of being appended to a literal, and that named groups use the (?P<name>...) syntax.
import re
text = "Order: 42 items, total: 189"
# re.search(): find FIRST match anywhere in string
m = re.search(r'\d+', text)
print(m.group()) # '42'
print(m.start()) # 7
# re.match(): match at START of string only
m = re.match(r'Order', text)
print(bool(m)) # True
# re.fullmatch(): entire string must match
re.fullmatch(r'\d+', '12345') # Match object
re.fullmatch(r'\d+', '123x') # None
# re.findall(): return list of all matches
re.findall(r'\d+', text) # ['42', '189']
# re.finditer(): iterator of match objects (more info than findall)
for m in re.finditer(r'\d+', text):
    print(m.group(), m.start(), m.end())
# re.sub(): replace matches
re.sub(r'\d+', 'N', text) # 'Order: N items, total: N'
# re.compile(): compile once for reuse (re also caches recent patterns, so the gain in loops is usually modest)
pattern = re.compile(r'\d+', re.IGNORECASE)
pattern.findall(text)
Python Flags
re.IGNORECASE (re.I) # Case-insensitive
re.MULTILINE (re.M) # ^ and $ match line boundaries
re.DOTALL (re.S) # . matches newlines
re.VERBOSE (re.X) # Allow whitespace and comments in pattern
re.UNICODE (re.U) # Default in Python 3, enables \w to match Unicode
re.ASCII (re.A) # Restrict \w, \d etc. to ASCII range
# Combine flags with |
re.compile(r'pattern', re.I | re.M)
Named Groups in Python
# Python uses (?P<name>...) syntax
date_re = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
m = date_re.search('Today is 2026-02-27')
if m:
    print(m.group('year')) # '2026'
    print(m.groupdict()) # {'year': '2026', 'month': '02', 'day': '27'}
# Use in re.sub with \g<name> back-reference
date_re.sub(r'\g<day>/\g<month>/\g<year>', '2026-02-27')
# Returns '27/02/2026'
7. Multiline Mode
By default, ^ matches the start of the entire string and $ matches the end. In multiline mode, they match the start and end of each line. This is essential for processing files, logs, and config blocks.
// JavaScript: m flag
const log = `[INFO] Server started
[ERROR] Connection refused
[INFO] Retry attempt 1`;
// Without m flag: ^ only matches very start of string
log.match(/^\[ERROR\].*/) // null (doesn't start at position 0)
// With m flag: ^ matches after each newline
log.match(/^\[ERROR\].*/m) // ['[ERROR] Connection refused']
log.match(/^\[ERROR\].*/mg) // all ERROR lines
# Python: re.MULTILINE
import re
log = """[INFO] Server started
[ERROR] Connection refused
[INFO] Retry attempt 1"""
errors = re.findall(r'^\[ERROR\].*', log, re.MULTILINE)
print(errors) # ['[ERROR] Connection refused']
# Combining MULTILINE and DOTALL
# re.MULTILINE: ^ and $ per line
# re.DOTALL: . matches \n
# These are independent: use both when needed
block_re = re.compile(r'^START.*?END$', re.MULTILINE | re.DOTALL)
Practical: Extract Log Entries
// Extract timestamp, level, and message from each line
// (assumes log holds lines shaped like "2026-02-27T10:15:00 ERROR Connection refused")
const logPattern = /^(?<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) (?<level>INFO|WARN|ERROR) (?<message>.+)$/mg;
const entries = [...log.matchAll(logPattern)].map(m => ({
  timestamp: m.groups.timestamp,
  level: m.groups.level,
  message: m.groups.message,
}));
8. Performance Tips and Avoiding Catastrophic Backtracking
A poorly written regex can take exponential time to evaluate, effectively hanging your application. This is called catastrophic backtracking or ReDoS (Regular Expression Denial of Service).
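The growth is easy to observe by timing a nested-quantifier pattern against slightly longer non-matching inputs. The following is a rough sketch, assuming a runtime with a global performance object (browsers, Node 16+); the helper name timeMatch is hypothetical, the input sizes are kept small on purpose, and absolute timings will vary by engine and machine.

```javascript
// Times a single regex test and reports the result plus elapsed milliseconds.
function timeMatch(re, input) {
  const start = performance.now();
  const matched = re.test(input);
  return { matched, ms: performance.now() - start };
}

const nested = /(a+)+$/; // ambiguous: a run of 'a' can be split in many ways
const flat = /a+$/;      // accepts the same strings, with only one way to match

for (const n of [10, 14, 18]) {
  const input = 'a'.repeat(n) + 'b'; // the trailing 'b' forces every attempt to fail
  const slow = timeMatch(nested, input);
  const fast = timeMatch(flat, input);
  console.log(`n=${n} nested=${slow.ms.toFixed(2)}ms flat=${fast.ms.toFixed(2)}ms`);
}
```

Keep n small when experimenting: with the nested pattern, each additional 'a' roughly doubles the work.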
The Danger Pattern: Nested Quantifiers
// DANGEROUS: exponential backtracking on inputs like 'aaaaaab'
/(a+)+$/.test('aaaaaab') // false; the time roughly doubles with each extra 'a'
// Why: the outer + and inner + can split the same run of 'a' in many different ways
// For 'aaaa', the engine tries: (aaaa), (aaa)(a), (aa)(aa), (a)(aaa)... etc.
// SAFE alternative: drop the nested quantifier so each 'a' can be matched in only one way
/^a+$/.test('aaaaaab') // false, fast (fails at 'b' immediately)
Performance Best Practices
| Rule | Why |
|---|---|
| Anchor your patterns with ^ and $ | Prevents the engine from trying every position in the string |
| Prefer specific character classes over . | /[a-z]+/ is faster than /.+/ for letter-only data |
| Use lazy quantifiers (*?, +?) only when needed | Greedy is often faster when anchored correctly |
| Avoid nested quantifiers on overlapping patterns | (a+)+ on non-matching input causes exponential time |
| Compile patterns outside loops | re.compile() in Python, const re = /pattern/ at module level in JS |
| Test with ReDoS checkers | Tools like vuln-regex-detector or regex101 flag catastrophic patterns |
| Use atomic groups (?>...) in PHP/PCRE | Prevents backtracking into the group once it matches |
| Use possessive quantifiers in Java/Perl | /a++/ prevents backtracking into the quantifier |
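Two of these rules, compiling once and keeping the pattern anchored, combine naturally in a small validator. The sketch below reuses the slug pattern from section 2; the MAX_LEN cap and the function name isValidSlug are assumptions added for illustration, not part of any library.

```javascript
// Compiled once at module level and reused on every call.
// No g flag, so test() carries no lastIndex state between calls.
const SLUG_RE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Hypothetical upper bound; capping input length bounds the worst-case matching work.
const MAX_LEN = 200;

function isValidSlug(value) {
  // Cheap checks run first, before the regex engine is involved.
  if (typeof value !== 'string' || value.length === 0 || value.length > MAX_LEN) {
    return false;
  }
  return SLUG_RE.test(value);
}

const results = ['hello-world', 'double--dash', '-leading', 'a'.repeat(500)].map(isValidSlug);
console.log(results); // [true, false, false, false]
```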
9. Regex in Other Languages
Go: regexp Package (RE2 Engine)
Go uses the RE2 engine, which guarantees linear time matching but does NOT support lookahead, lookbehind, or backreferences. This is a deliberate safety tradeoff.
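One practical consequence is that the lookahead-based password pattern from section 2 will not compile under RE2. A common workaround is to run one simple pattern per requirement. The sketch below shows that structure in JavaScript, for consistency with the rest of this guide; each pattern uses only syntax that RE2 also accepts, and the names passwordRules and failedPasswordRules are hypothetical.

```javascript
// One lookahead-free check per requirement.
const passwordRules = [
  { name: 'length',    re: /^.{8,}$/ },
  { name: 'lowercase', re: /[a-z]/ },
  { name: 'uppercase', re: /[A-Z]/ },
  { name: 'digit',     re: /\d/ },
  { name: 'special',   re: /[@$!%*?&]/ },
  { name: 'allowed',   re: /^[A-Za-z\d@$!%*?&]+$/ },
];

// Returns the names of the rules the candidate fails (empty array = valid).
function failedPasswordRules(candidate) {
  return passwordRules
    .filter((rule) => !rule.re.test(candidate))
    .map((rule) => rule.name);
}

console.log(failedPasswordRules('Passw0rd!')); // []
console.log(failedPasswordRules('password'));  // ['uppercase', 'digit', 'special']
```

A side benefit of this design is that the caller learns which requirement failed, which a single combined pattern cannot report.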
package main
import (
	"fmt"
	"regexp"
)
func main() {
	// MustCompile panics on an invalid pattern; use it for static patterns
	re := regexp.MustCompile(`\d+`)
	// Compile returns an error instead; use it for dynamic patterns
	re2, err := regexp.Compile(`\d+`)
	if err != nil { /* handle */ }
	_ = re2
	text := "Order: 42 items, total: 189"
	// MatchString: equivalent to test()
	fmt.Println(re.MatchString(text)) // true
	// FindString: first match
	fmt.Println(re.FindString(text)) // 42
	// FindAllString: all matches
	fmt.Println(re.FindAllString(text, -1)) // [42 189]
	// FindStringSubmatch: first match + groups
	re3 := regexp.MustCompile(`(\d+)-(\d+)`)
	m := re3.FindStringSubmatch("Range: 10-99")
	fmt.Println(m) // [10-99 10 99]
	// Named groups: use (?P<name>...)
	re4 := regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})`)
	match := re4.FindStringSubmatch("2026-02")
	names := re4.SubexpNames()
	for i, name := range names {
		if name != "" && i < len(match) {
			fmt.Printf("%s: %s\n", name, match[i])
		}
	}
	// ReplaceAllString
	result := re.ReplaceAllString(text, "N")
	fmt.Println(result) // Order: N items, total: N
}
Rust: regex Crate
The Rust regex crate takes the same approach as RE2 (no lookahead, lookbehind, or backreferences) in exchange for guaranteed linear-time matching.
use regex::Regex;
fn main() {
    let re = Regex::new(r"\d+").unwrap();
    // is_match: boolean test
    println!("{}", re.is_match("foo42")); // true
    // find: first match with position
    if let Some(m) = re.find("foo42bar") {
        println!("{}", m.as_str()); // "42"
        println!("{}", m.start()); // 3
    }
    // find_iter: iterator of all matches
    for mat in re.find_iter("42 items and 189 units") {
        println!("{}", mat.as_str());
    }
    // captures: first match with groups
    let date_re = Regex::new(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})").unwrap();
    if let Some(caps) = date_re.captures("Date: 2026-02-27") {
        println!("{}", &caps["year"]); // 2026
        println!("{}", &caps["month"]); // 02
    }
    // replace / replace_all
    let result = re.replace_all("42 and 189", "N");
    println!("{}", result); // "N and N"
}
Java: java.util.regex
import java.util.regex.*;
public class RegexDemo {
    public static void main(String[] args) {
        String text = "Order: 42 items, total: 189";
        // Quick match check
        System.out.println(text.matches(".*\\d+.*")); // true
        // .matches() tests the ENTIRE string
        // Pattern + Matcher for more control
        Pattern p = Pattern.compile("\\d+");
        Matcher m = p.matcher(text);
        // find() moves to next match, group() returns it
        while (m.find()) {
            System.out.println(m.group() + " at " + m.start());
        }
        // Named groups (Java 7+)
        Pattern dp = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");
        Matcher dm = dp.matcher("2026-02-27");
        if (dm.matches()) {
            System.out.println(dm.group("year")); // 2026
        }
        // replaceAll
        System.out.println(text.replaceAll("\\d+", "N"));
    }
}
10. Regex for Text Processing Tasks
Log File Parsing
// Apache/Nginx combined log format
const apacheLogRe = /^(?<ip>[\d.]+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<method>\w+) (?<path>[^ "]+)[^"]*" (?<status>\d{3}) (?<size>\d+|-)/mg;
const entries = [...log.matchAll(apacheLogRe)].map(m => m.groups);
const errors = entries.filter(e => e.status.startsWith('5'));
Markdown Link Extraction
// Extract [text](url) markdown links
const mdLinkRe = /\[(?<text>[^\]]+)\]\((?<url>[^)]+)\)/g;
const markdown = 'See [DevToolBox](https://viadreams.cc) and [docs](https://docs.example.com)';
const links = [...markdown.matchAll(mdLinkRe)].map(m => ({
  text: m.groups.text,
  url: m.groups.url,
}));
// [{text:'DevToolBox', url:'https://viadreams.cc'}, ...]
HTML Attribute Extraction (with Caveats)
// Extract href values: works for simple, well-formed HTML only
const hrefRe = /href=["'](?<url>[^"']+)["']/gi;
// WARNING: Regex is NOT a substitute for a proper HTML parser.
// Nested quotes, CDATA sections, comments, and malformed HTML
// will all break simple regex approaches.
// Use DOMParser, cheerio, or htmlparser2 for production HTML parsing.
// Safe use case: extracting from KNOWN, CONTROLLED template output
const template = '<a href="/about">About</a>';
const href = template.match(/href="([^"]+)"/)?.[1]; // '/about'
Code Comment Removal
// Remove single-line // comments (naive: breaks on // inside strings)
code.replace(/\/\/.*$/mg, '');
// Remove /* block */ comments (non-greedy to avoid over-matching)
code.replace(/\/\*[\s\S]*?\*\//g, '');
// NOTE: For production code stripping, use a proper AST parser
// (Babel, esprima, acorn): regex cannot handle all edge cases
// like // inside string literals or nested block comments.
CSV Parsing Pitfalls
// Simple CSV split: FAILS on quoted fields containing commas
'a,b,c'.split(','); // ['a', 'b', 'c'] (correct)
// Quoted field with comma: simple split fails
'"hello, world",foo,bar'.split(','); // ['"hello', ' world"', 'foo', 'bar'] (wrong)
// Better regex for quoted CSV fields
const csvFieldRe = /(?:^|,)(?:"([^"]*(?:""[^"]*)*)"|([^,]*))/g;
// Still not RFC 4180 compliant: use Papa Parse or csv-parse for production
11. Testing Strategies for Regex Patterns
A regex that works on your test case may fail on edge cases in production. Use our online regex tester to systematically test all of the following categories.
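One way to make this kind of edge-case testing repeatable is a small probe that runs any pattern against a fixed set of awkward inputs. This is only a sketch: the function name probe and the chosen inputs are hypothetical, and a real suite would add cases specific to the pattern under test.

```javascript
// Runs a pattern against a fixed set of edge-case inputs and reports which ones match.
function probe(re) {
  // Work on a copy without g/y so test() is not affected by lastIndex state.
  const fresh = new RegExp(re.source, re.flags.replace(/[gy]/g, ''));
  const inputs = {
    empty: '',
    newline: 'abc\ndef',
    accented: 'caf\u00e9',
    padded: 'x123x',
    long: 'a'.repeat(10000),
  };
  const report = {};
  for (const [label, input] of Object.entries(inputs)) {
    report[label] = fresh.test(input);
  }
  return report;
}

console.log(probe(/^\w+$/));
// { empty: false, newline: false, accented: false, padded: true, long: true }
console.log(probe(/^\d{3}$/));
// { empty: false, newline: false, accented: false, padded: false, long: false }
```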
Email Regex Test Cases
| Input | Expected | Notes |
|---|---|---|
| user@example.com | MATCH | Standard email |
| first.last@domain.co.uk | MATCH | Dotted local part + multi-part country TLD |
| user+tag@example.org | MATCH | Plus-addressing |
| user@subdomain.example.com | MATCH | Subdomain |
| @example.com | NO MATCH | Missing local part |
| user@ | NO MATCH | Missing domain |
| user@.com | NO MATCH | Domain starts with dot |
| user@example | NO MATCH | No TLD |
| user @example.com | NO MATCH | Space in local part |
| "user@name"@example.com | DEPENDS | Quoted local parts: RFC allows, most regexes reject |
| user@[192.168.1.1] | DEPENDS | IP address domain: technically valid per RFC |
| (empty string) | NO MATCH | Empty input |
| aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@example.com | DEPENDS | Very long local part |
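The unambiguous rows of this table translate directly into a table-driven check. The sketch below runs them against the email pattern from section 2; the DEPENDS rows are left out because their expected result is a policy decision, and the variable names are illustrative.

```javascript
// Email pattern from section 2.
const emailRe = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// [input, expected] pairs taken from the MATCH / NO MATCH rows above.
const cases = [
  ['user@example.com', true],
  ['first.last@domain.co.uk', true],
  ['user+tag@example.org', true],
  ['user@subdomain.example.com', true],
  ['@example.com', false],
  ['user@', false],
  ['user@.com', false],
  ['user@example', false],
  ['user @example.com', false],
  ['', false],
];

// Collect every case whose actual result differs from the expectation.
const failures = cases.filter(([input, expected]) => emailRe.test(input) !== expected);
console.log(failures.length === 0 ? 'all cases pass' : failures); // all cases pass
```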
General Testing Checklist
- Empty string: Does your pattern handle "" correctly?
- Minimum valid input: Single character, shortest possible match.
- Maximum valid input: Very long strings; check performance.
- Boundary conditions: Exactly at min/max length limits.
- Unicode: Emojis, accented characters, CJK. Does \w behave as expected?
- Newlines: Does . match \n? Do you need the s flag?
- Non-matching input: Confirm false positives don't sneak through.
- Anchoring: Test "xpatternx" to confirm anchors prevent partial matches.
12. Common Regex Mistakes and How to Fix Them
Mistake 1: Forgetting to Escape the Dot
// WRONG: matches "3.14" but also "3X14", "3914"
/3.14/.test('3X14'); // true (. matches any char)
// RIGHT
/3\.14/.test('3X14'); // false
/3\.14/.test('3.14'); // true
Mistake 2: Greedy When You Need Lazy
const html = '<b>bold</b> and <i>italic</i>';
// WRONG: greedy .+ matches as much as possible
html.match(/<.+>/)?.[0]; // '<b>bold</b> and <i>italic</i>'
// RIGHT: lazy .+? stops at first >
html.match(/<.+?>/)?.[0]; // '<b>'
html.match(/<.+?>/g); // ['<b>', '</b>', '<i>', '</i>']
Mistake 3: Not Anchoring Validation Patterns
// WRONG: passes "abc123xyz" because \d+ matches "123" anywhere
/\d+/.test('abc123xyz'); // true, so this is not a validation pattern!
// RIGHT: validate the ENTIRE input
/^\d+$/.test('abc123xyz'); // false
/^\d+$/.test('123'); // true
Mistake 4: Forgetting the g Flag in JavaScript
const text = 'foo bar baz';
// WRONG: only replaces first match
text.replace(/\b\w+\b/, 'X'); // 'X bar baz'
// RIGHT: global flag replaces all
text.replace(/\b\w+\b/g, 'X'); // 'X X X'
// Also affects match(): without g, it returns only the first match (with index and groups)
text.match(/\w+/); // ['foo', index: 0, ...]
text.match(/\w+/g); // ['foo', 'bar', 'baz']
Mistake 5: Catastrophic Backtracking with (a+)+
// DANGEROUS: exponential time on non-matching input
const evil = /(a+)+b/;
// Matching input stays fast at any length, because the first attempt succeeds:
evil.test('aaab'); // true, fast
evil.test('aaaaaaaaaab'); // true, fast
evil.test('aaaaaaaaaaaaaaaaaaaab'); // true, fast
// Non-matching input is the problem: each extra 'a' roughly doubles the work
evil.test('aaaaaaaaaaaaaaaaaaaac'); // false, but slow; add more 'a's and it effectively hangs
// FIX: Rewrite to eliminate ambiguity
// Option 1: atomic group (not in JS; available in PCRE, Java, and Python 3.11+)
// Option 2: possessive quantifier (not in JS; available in PCRE, Java, and Python 3.11+)
// Option 3: restructure to remove nested quantifiers
/a+b/.test('aaab'); // works correctly, no ambiguity
Mistake 6: Unicode Issues with \w and \d
// In JavaScript, \w = [a-zA-Z0-9_] (ASCII only)
/^\w+$/.test('café'); // false: accented char not in \w
/^\w+$/.test('hello'); // true
// The u flag enables Unicode-aware matching for some features, but \w is still ASCII in JS
// Use Unicode property escapes to match letters and digits in any script:
/^[\p{L}\p{N}]+$/u.test('café'); // true (with u flag and \p Unicode category)
# Python (re.UNICODE is the default for str patterns in Python 3):
# \w matches all Unicode word characters including accented letters
import re
bool(re.match(r'^\w+$', 'café')) # True in Python 3
Key Takeaways
- Use an online regex tester to get instant visual feedback before adding patterns to your codebase.
- Always anchor validation patterns with ^ and $ to prevent partial matches.
- Named capture groups ((?<name>...)) make patterns self-documenting and protect against index shifts.
- Lookahead/lookbehind are zero-width assertions: they check context without consuming characters, which suits password validation.
- Avoid nested quantifiers on overlapping patterns like (a+)+, which cause catastrophic backtracking on non-matching input.
- Go uses RE2 (no lookahead/lookbehind); JavaScript and Python use backtracking engines that support lookaround and backreferences.
- Compile patterns outside loops with re.compile() (Python) or module-level literals (JavaScript) so they are built once and reused.
- Always add the g flag in JavaScript when you want to replace or match all occurrences, not just the first.
Ready to build and test your own patterns? Open the free Regex Tester: no signup required, instant match highlighting, and a built-in cheat sheet.