Regex for Developers: A Practical Guide with Examples

Regular expressions are one of those skills that every developer needs but few take the time to learn properly. You copy a regex from Stack Overflow, it works, and you move on — until the day it doesn't, and you're staring at a wall of symbols with no idea where to start debugging.

This guide builds your regex knowledge from the ground up with practical, real-world examples you can use immediately.

The Building Blocks

At its core, a regex is just a pattern that describes text. The simplest regex is a literal string:

hello     matches "hello" in "say hello world"

But regex becomes powerful when you add special characters called metacharacters:

Character  Meaning                                Example
.          Any single character (except newline)  h.t matches "hat", "hot", "h9t"
^          Start of string                        ^Hello matches "Hello world" but not "Say Hello"
$          End of string                          world$ matches "hello world" but not "world cup"
\          Escape a metacharacter                 \. matches a literal dot
|          OR (alternation)                       cat|dog matches "cat" or "dog"
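
To try the metacharacters from the table, here is a minimal sketch using Python's re module. The helper name first_match is an assumption made for this example, not part of any library.

```python
import re
from typing import Optional

def first_match(pattern: str, text: str) -> Optional[str]:
    """Return the first substring the pattern matches, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group() if m else None

print(first_match(r"h.t", "a hot day"))        # hot
print(first_match(r"^Hello", "Hello world"))   # Hello
print(first_match(r"^Hello", "Say Hello"))     # None
print(first_match(r"world$", "world cup"))     # None
print(first_match(r"\.", "file.txt"))          # .
print(first_match(r"cat|dog", "hot dog"))      # dog
```

Raw strings (the r"..." prefix) keep Python from interpreting the backslashes before the regex engine sees them.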

Character Classes

Character classes let you match one character from a set. They use square brackets:

[aeiou]       any vowel
[0-9]         any digit
[a-zA-Z]      any letter (upper or lower)
[^0-9]        any character that is NOT a digit

Notice that ^ as the first character inside square brackets means "NOT", which is unrelated to its start-of-string meaning outside brackets.
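
As a quick illustration, the classes above can be run against a sample string in Python. The sample text is made up for this sketch.

```python
import re

text = "Order 66 shipped"  # made-up sample input

vowels = re.findall(r"[aeiou]", text)       # lowercase vowels only
digits = re.findall(r"[0-9]", text)         # each digit separately
non_digits = re.findall(r"[^0-9]+", text)   # runs of anything that is not a digit

print(vowels)      # ['e', 'i', 'e']
print(digits)      # ['6', '6']
print(non_digits)  # ['Order ', ' shipped']
```

The capital "O" is not matched because [aeiou] lists only lowercase letters; [aeiouAEIOU] would cover both cases.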

Shorthand Classes

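Regex provides shortcuts for common character classes. The equivalents below are the ASCII definitions; the exact sets vary by engine, and Unicode-aware engines may also match non-ASCII digits, letters, and whitespace: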

Shorthand  Equivalent      Meaning
\d         [0-9]           Any digit
\D         [^0-9]          Any non-digit
\w         [a-zA-Z0-9_]    Any word character
\W         [^a-zA-Z0-9_]   Any non-word character
\s         [ \t\n\r\f]     Any whitespace
\S         [^ \t\n\r\f]    Any non-whitespace

Tip: Uppercase shorthand classes are always the inverse of their lowercase version. \d matches digits, \D matches everything else.
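
The shorthand classes can be sketched in Python as follows. The log line is invented for the example. One Python-specific detail is worth knowing: for str patterns \d is Unicode-aware by default, and the re.ASCII flag restricts it to [0-9].

```python
import re

line = "user_42 logged in\tat 09:15"  # invented sample line

numbers = re.findall(r"\d+", line)   # runs of digits
words = re.findall(r"\w+", line)     # runs of word characters
fields = re.split(r"\s+", line)      # split on any whitespace, including the tab

print(numbers)  # ['42', '09', '15']
print(words)    # ['user_42', 'logged', 'in', 'at', '09', '15']
print(fields)   # ['user_42', 'logged', 'in', 'at', '09:15']

# "42" written with Arabic-Indic digits (U+0664, U+0662)
arabic_42 = "\u0664\u0662"
unicode_ok = re.fullmatch(r"\d+", arabic_42) is not None           # True
ascii_ok = re.fullmatch(r"\d+", arabic_42, re.ASCII) is not None   # False
```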

Quantifiers: How Many?

Quantifiers specify how many times the preceding element should repeat:

Quantifier  Meaning             Example
*           0 or more           ab*c matches "ac", "abc", "abbc"
+           1 or more           ab+c matches "abc", "abbc" but not "ac"
?           0 or 1 (optional)   colou?r matches "color" and "colour"
{3}         Exactly 3           \d{3} matches "123" but not "12"
{2,5}       Between 2 and 5     \d{2,5} matches "12", "123", "12345"
{3,}        3 or more           \d{3,} matches "123", "1234", etc.
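
One way to check the quantifier rows is with re.fullmatch, which succeeds only if the whole string fits the pattern. The helper name matches_whole is an assumption for this sketch.

```python
import re

def matches_whole(pattern: str, text: str) -> bool:
    """True if the entire text fits the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

print(matches_whole(r"ab*c", "ac"))          # True  (zero b's is allowed)
print(matches_whole(r"ab+c", "ac"))          # False (needs at least one b)
print(matches_whole(r"colou?r", "colour"))   # True
print(matches_whole(r"\d{3}", "12"))         # False
print(matches_whole(r"\d{2,5}", "123456"))   # False (six digits is too many)
print(matches_whole(r"\d{3,}", "123456"))    # True
```

With re.search instead of re.fullmatch, \d{2,5} would still find a match inside "123456", because it would simply take the first five digits.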

Greedy vs Lazy

By default, quantifiers are greedy — they match as much text as possible. Add a ? to make them lazy (match as little as possible):

# Input: <b>bold</b> and <b>more</b>

<b>.*</b>      greedy: matches "<b>bold</b> and <b>more</b>"
<b>.*?</b>     lazy:   matches "<b>bold</b>" (stops at first </b>)

Common mistake: Using .* when you mean .*?. Greedy matching with .* is a frequent cause of patterns matching more text than expected. When in doubt, try the lazy version, or a negated character class such as [^<]* that cannot run past the delimiter.
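
The difference is easy to see in Python with the same input. The third pattern, which uses a negated character class, is an extra option added for this sketch rather than something from the example above.

```python
import re

html = "<b>bold</b> and <b>more</b>"

greedy = re.findall(r"<b>.*</b>", html)       # .* runs to the LAST </b>
lazy = re.findall(r"<b>.*?</b>", html)        # .*? stops at the FIRST </b>
negated = re.findall(r"<b>[^<]*</b>", html)   # [^<]* cannot cross a "<" at all

print(greedy)   # ['<b>bold</b> and <b>more</b>']
print(lazy)     # ['<b>bold</b>', '<b>more</b>']
print(negated)  # ['<b>bold</b>', '<b>more</b>']
```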

Groups and Capturing

Parentheses () create groups. Groups serve two purposes: they let you apply quantifiers to multi-character sequences, and they capture the matched text for extraction.

# Group + quantifier
(ha)+          matches "ha", "haha", "hahaha"

# Capturing groups (numbered left to right)
(\d{4})-(\d{2})-(\d{2})
# Input: "2026-03-05"
# Group 1: "2026"
# Group 2: "03"
# Group 3: "05"
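
In Python, the numbered groups from the date example can be read off the match object, and the same numbers work as backreferences in a replacement string. The input sentence is made up for this sketch.

```python
import re

DATE = r"(\d{4})-(\d{2})-(\d{2})"

m = re.search(DATE, "Released on 2026-03-05.")  # made-up input
whole = m.group(0)     # the entire match
parts = m.groups()     # all captured groups, in order
year, month, day = (int(p) for p in parts)

# \1, \2, \3 in the replacement refer to the captured groups
reordered = re.sub(DATE, r"\3/\2/\1", "2026-03-05")

print(whole)             # 2026-03-05
print(parts)             # ('2026', '03', '05')
print(year, month, day)  # 2026 3 5
print(reordered)         # 05/03/2026
```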

Non-Capturing Groups

If you need grouping but don't need to capture, use (?:...):

# Non-capturing group (just for alternation)
(?:https?|ftp)://\S+

# Same matching behaviour, but no capture overhead
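
In Python the choice has a visible side effect: re.findall returns the captured group instead of the whole match when the pattern contains exactly one capturing group. A small sketch with made-up URLs:

```python
import re

text = "See https://example.com and ftp://files.example.org"  # made-up URLs

capturing = re.findall(r"(https?|ftp)://\S+", text)        # returns group 1 only
non_capturing = re.findall(r"(?:https?|ftp)://\S+", text)  # returns whole matches

print(capturing)      # ['https', 'ftp']
print(non_capturing)  # ['https://example.com', 'ftp://files.example.org']
```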

Named Groups

For readability, name your captures with (?<name>...). The exact spelling varies by engine; Python's re module, for example, writes it as (?P<name>...):

(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})

Most languages let you access named groups by name instead of index, making your code much clearer.
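
Here is a minimal Python sketch of access by name. Note that it uses Python's (?P<name>...) spelling, since the re module does not accept (?<name>...). The input string is invented.

```python
import re

# Python spells named groups (?P<name>...)
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

m = DATE.search("Due 2026-03-05")  # invented input
year = m.group("year")             # access by name
fields = m.groupdict()             # all named groups as a dict

print(year)    # 2026
print(fields)  # {'year': '2026', 'month': '03', 'day': '05'}
```

Named groups remain numbered as well, so m.group(1) returns the same value as m.group("year") here.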

Anchors and Boundaries

Anchors don't match characters — they match positions in the string:

Anchor  Position
^       Start of string (or line with multiline flag)
$       End of string (or line with multiline flag)
\b      Word boundary (between \w and \W)
\B      Non-word boundary

Word boundaries are incredibly useful for matching whole words:

\bcat\b        matches "cat" but not "concatenate" or "scatter"
\berror\b      matches "error" but not "errors" or "terror"
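
The effect of \b can be checked by counting matches with and without it. The log line is a made-up example.

```python
import re

log = "error: 3 errors found, terror level high, fatal error"  # made-up log line

loose = re.findall(r"error", log)      # also hits inside "errors" and "terror"
whole = re.findall(r"\berror\b", log)  # only the standalone word

print(len(loose))  # 4
print(len(whole))  # 2
```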

Lookahead and Lookbehind

Lookarounds let you match based on what comes before or after, without including it in the match:

Syntax    Name                 Meaning
(?=...)   Positive lookahead   Followed by ...
(?!...)   Negative lookahead   NOT followed by ...
(?<=...)  Positive lookbehind  Preceded by ...
(?<!...)  Negative lookbehind  NOT preceded by ...

# Match "USD" only when followed by a number
USD(?=\d)           matches "USD" in "USD100" but not "USD only"

# Match a number NOT preceded by a minus sign
(?<!-)\b\d+\b       matches "42" but not "-42"

# Password validation: at least 8 characters, including one digit and one uppercase letter
^(?=.*\d)(?=.*[A-Z]).{8,}$

Tip: Lookaheads and lookbehinds are "zero-width": they check a condition without consuming characters. This means multiple lookaheads can be stacked at the same position, which is why they're perfect for password validation rules.
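
The three lookaround patterns can be exercised in Python like this. The function name is_valid_password and the sample inputs are assumptions made for the sketch.

```python
import re

# Lookahead: "USD" is matched, the digit after it is only checked
usd = re.findall(r"USD(?=\d)", "USD100 and USD only")

# Negative lookbehind: numbers not preceded by a minus sign
positives = re.findall(r"(?<!-)\b\d+\b", "42 -42 7")

PASSWORD = re.compile(r"^(?=.*\d)(?=.*[A-Z]).{8,}$")

def is_valid_password(candidate: str) -> bool:
    """Hypothetical helper: 8+ characters with a digit and an uppercase letter."""
    return PASSWORD.search(candidate) is not None

print(usd)                                # ['USD']
print(positives)                          # ['42', '7']
print(is_valid_password("Passw0rdX"))     # True
print(is_valid_password("password1"))     # False (no uppercase letter)
print(is_valid_password("Sh0rt"))         # False (too short)
print(is_valid_password("NoDigitsHere"))  # False (no digit)
```

Both lookaheads start from position 0 and scan forward independently, so the order of the digit and the uppercase letter in the password does not matter.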

Real-World Patterns

Here are battle-tested patterns you'll actually use:

Email (simplified)

[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}

This covers the vast majority of real email addresses. A fully RFC-compliant email regex is thousands of characters long and rarely necessary.
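
A rough sketch of how the pattern might be used in Python, both to check a single value and to pull addresses out of text. The function name looks_like_email and the addresses are invented for illustration.

```python
import re

EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def looks_like_email(value: str) -> bool:
    """Hypothetical helper: True if the whole value has an email-like shape."""
    return EMAIL.fullmatch(value) is not None

print(looks_like_email("jane.doe+news@example.co.uk"))  # True
print(looks_like_email("jane@localhost"))               # False (no dot in the domain)
print(looks_like_email("not an email"))                 # False

found = EMAIL.findall("Contact a@b.io or c.d@e.org.")
print(found)  # ['a@b.io', 'c.d@e.org']
```

A match only says the text has the right shape; it does not confirm that the mailbox exists.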

IPv4 Address

\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b

Note: this matches the format but doesn't validate the range (0–255). For strict validation, you'd need alternation or post-match checks.
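
The post-match check could look like the following sketch: the same pattern with a capturing group around each octet, followed by a range test in Python. The function name find_ipv4 and the sample text are assumptions.

```python
import re
from typing import List

# Same pattern as above, with a capturing group around each octet
IPV4 = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b")

def find_ipv4(text: str) -> List[str]:
    """Hypothetical helper: keep only matches whose octets are all <= 255."""
    return [
        m.group()
        for m in IPV4.finditer(text)
        if all(int(octet) <= 255 for octet in m.groups())
    ]

print(find_ipv4("ok 192.168.1.10, bad 999.1.1.1, ok 10.0.0.255"))
# ['192.168.1.10', '10.0.0.255']
```

Doing the range check in code keeps the regex short and readable compared with encoding 0 to 255 as alternation.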

ISO Date (YYYY-MM-DD)

\b\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])\b
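
This pattern limits months to 01-12 and days to 01-31, but it cannot know that February has no 31st. One possible approach, sketched here in Python, is to capture the three parts and let datetime.date reject impossible dates. The function name parse_iso_date is an assumption.

```python
import re
from datetime import date
from typing import Optional

# Same pattern as above, with capturing groups for year, month, and day
ISO_DATE = re.compile(r"\b(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])\b")

def parse_iso_date(text: str) -> Optional[date]:
    """Hypothetical helper: first valid calendar date in text, else None."""
    m = ISO_DATE.search(text)
    if m is None:
        return None
    try:
        return date(*(int(part) for part in m.groups()))
    except ValueError:  # right shape, impossible date (e.g. February 31)
        return None

print(parse_iso_date("due 2026-03-05"))  # 2026-03-05
print(parse_iso_date("2026-02-31"))      # None
print(parse_iso_date("2026-13-01"))      # None
```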

TODO/FIXME Comments

(?://|#)\s*(?:TODO|FIXME|HACK|XXX)\b.*
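
As a sketch of how this might be used, the snippet below scans a few lines of source text and reports each marker with its line number. The source text is invented for the example.

```python
import re

TODO = re.compile(r"(?://|#)\s*(?:TODO|FIXME|HACK|XXX)\b.*")

# Invented source text for illustration
source = """\
x = 1  # TODO: remove magic number
// FIXME handle null
y = 2  # regular comment
# TODONT is not a marker
"""

hits = []
for number, line in enumerate(source.splitlines(), start=1):
    m = TODO.search(line)
    if m:
        hits.append((number, m.group()))

print(hits)
# [(1, '# TODO: remove magic number'), (2, '// FIXME handle null')]
```

The \b after the keyword list is what keeps "TODONT" on line 4 from being reported.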

Semantic Version

\bv?\d+\.\d+\.\d+(?:-[a-zA-Z0-9.]+)?\b

Matches versions like 1.2.3, v2.0.0, and 1.0.0-beta.1.
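
A quick Python check of those three examples, using a made-up sentence as input:

```python
import re

SEMVER = re.compile(r"\bv?\d+\.\d+\.\d+(?:-[a-zA-Z0-9.]+)?\b")

# Made-up input containing the three example versions
versions = SEMVER.findall("Upgrade from 1.2.3 to v2.0.0 or 1.0.0-beta.1")

print(versions)  # ['1.2.3', 'v2.0.0', '1.0.0-beta.1']
```

Because the only group in the pattern is non-capturing, findall returns the full version strings.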

Common Pitfalls

Quick Reference

Pattern              Description
.                    Any character except newline
\d / \D              Digit / non-digit
\w / \W              Word char / non-word char
\s / \S              Whitespace / non-whitespace
[abc]                Character class (a, b, or c)
[^abc]               Negated class (not a, b, or c)
* / + / ?            0 or more / 1 or more / 0 or 1
{n,m}                Between n and m times
*? / +?              Lazy (minimal) versions
(group)              Capturing group
(?:group)            Non-capturing group
\b                   Word boundary
(?=...) / (?!...)    Lookahead / negative lookahead
(?<=...) / (?<!...)  Lookbehind / negative lookbehind

Test Patterns in Real Time

BoltKit's RegexLab tool lets you write regex patterns and test them against sample text with live highlighting, group extraction, and a library of common patterns, all on your iPhone or iPad.
