Python Regular Expressions
Master regex: search, findall, groups, substitution, and common patterns with the re module.
What are Regular Expressions?
Regular expressions (regex) are patterns for matching and extracting text.
Instead of searching for one exact string, you write a pattern that matches any string fitting a
rule — like "any email address" or "any UK phone number". Python's re module provides
all regex functionality. Key pattern characters: \d = digit, \w = word
character (letter, digit, or underscore), \s = whitespace, . = any character
except a newline, + = one or more, * = zero or more, ? = optional.
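To make the pattern characters above concrete, here is a small sketch that tries each one with re.findall(). The sample strings and variable names are made up for illustration.

```python
import re

# Illustrative sample text (not from any real data set)
sample = "Order 66 shipped to Flat 4B on 2024-03-15"

# \d+ : one or more digits
digits = re.findall(r"\d+", sample)

# \w+ : one or more word characters (letters, digits, underscore)
words = re.findall(r"\w+", sample)

# \s : a single whitespace character
spaces = re.findall(r"\s", sample)

# colou?r : the "u" is optional
optional = re.findall(r"colou?r", "color and colour")

# a.c : "." stands for any single character except a newline
any_char = re.findall(r"a.c", "abc a-c ac")

# ab*c : zero or more "b" characters between "a" and "c"
star = re.findall(r"ab*c", "ac abc abbbc")

print(digits)       # ['66', '4', '2024', '03', '15']
print(words)        # ['Order', '66', 'shipped', 'to', 'Flat', '4B', 'on', '2024', '03', '15']
print(len(spaces))  # 7
print(optional)     # ['color', 'colour']
print(any_char)     # ['abc', 'a-c']
print(star)         # ['ac', 'abc', 'abbbc']
```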
Tip: use raw strings r"..." for regex patterns. Without the r prefix, backslashes must be doubled: \\d instead of \d.
search() and match()
re.search() finds the pattern anywhere in the string and returns a match object
(or None). re.match() only matches at the very start. Call
.group() on the match to get the matched text.
import re

text = "Contact us at hello@example.com for help"
m = re.search(r"\w+@\w+\.\w+", text)
if m:
    print("Found:", m.group())
    print("At position:", m.start())

Found: hello@example.com
At position: 14
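The difference between re.search() and re.match() is easiest to see by running the same pattern through both. This is a minimal sketch reusing the sentence from the example above. The variable names are illustrative.

```python
import re

text = "Contact us at hello@example.com for help"
pattern = r"\w+@\w+\.\w+"

# re.match() only tries position 0, and the text does not start with an email
at_start = re.match(pattern, text)

# re.search() scans forward until the pattern fits somewhere
anywhere = re.search(pattern, text)

# re.match() does succeed when the pattern fits at the very start
first_word = re.match(r"\w+", text)

print(at_start)            # None
print(anywhere.group())    # hello@example.com
print(first_word.group())  # Contact
```

Because both functions can return None, check the result before calling .group() on it.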
findall() — All Matches
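re.findall() returns a list of every non-overlapping match, in the order found. It is the
usual choice when you want to extract multiple pieces of data from text. If the pattern
contains capture groups, the list holds the captured groups rather than the whole matches.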
import re

text = "Prices: £10, £25.50, £3.99, £100"
print(re.findall(r"£[\d.]+", text))

text2 = "Phone: 020-7946-0321 or 07700-900123"
print(re.findall(r"[\d-]+", text2))

['£10', '£25.50', '£3.99', '£100']
['020-7946-0321', '07700-900123']
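One detail worth seeing in action: when the pattern contains capture groups, re.findall() returns the groups instead of the whole match. The sketch below reuses the two sample strings from above. The variable names are illustrative.

```python
import re

text = "Prices: £10, £25.50, £3.99, £100"

# No group: each item is the whole match
whole = re.findall(r"£[\d.]+", text)

# One group: each item is only the captured part (the £ sign is dropped)
amounts = re.findall(r"£([\d.]+)", text)
as_numbers = [float(a) for a in amounts]

# Two groups: each item is a tuple with one entry per group
text2 = "Phone: 020-7946-0321 or 07700-900123"
split_numbers = re.findall(r"(\d+)-([\d-]+)", text2)

print(whole)          # ['£10', '£25.50', '£3.99', '£100']
print(amounts)        # ['10', '25.50', '3.99', '100']
print(as_numbers)     # [10.0, 25.5, 3.99, 100.0]
print(split_numbers)  # [('020', '7946-0321'), ('07700', '900123')]
```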
Substitution with sub()
re.sub(pattern, replacement, text) replaces every match with the replacement string.
Powerful for cleaning and transforming text data.
import re

# Remove all punctuation
clean = re.sub(r"[^\w\s]", "", "Hello, World! How are you?")
print(clean)

# Convert DD/MM/YYYY to YYYY-MM-DD
date = re.sub(r"(\d{2})/(\d{2})/(\d{4})", r"\3-\2-\1", "Date: 25/12/2024")
print(date)

Hello World How are you
Date: 2024-12-25
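Two more everyday uses of re.sub() are sketched here: collapsing runs of whitespace, and limiting the number of replacements with the optional count argument. The sample strings are made up for illustration.

```python
import re

# Illustrative input with mixed spaces, a tab, and a newline
messy = "Hello,   World!\t How  are\nyou?"

# Collapse every run of whitespace into a single space
tidy = re.sub(r"\s+", " ", messy)

# count=1 replaces only the first match and leaves the rest untouched
first_only = re.sub(r"\d+", "#", "Room 12, Floor 3, Desk 45", count=1)

print(tidy)        # Hello, World! How are you?
print(first_only)  # Room #, Floor 3, Desk 45
```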
🧠 Quick Check
Which function returns all matches as a list?