Re Module
Last Updated: 22nd August 2025
The re module is used for working with regular expressions in Python. It allows you to search, match, and manipulate text using patterns.
The most common pattern symbols are:
| Symbol | Description | Example | Matches |
|---|---|---|---|
| ^ | Matches the start of the string (or of each line with re.M) | ^Hello | Hello at the start of "Hello world" |
| $ | Matches the end of the string (or of each line with re.M) | end$ | end at the end of "the end" |
| {m} | Matches exactly m repetitions of the preceding element | a{3} | aaa |
| {m,} | Matches at least m repetitions of the preceding element | a{2,} | aa, aaa, aaaa, ... |
| {m,n} | Matches between m and n repetitions (inclusive) | a{2,4} | aa, aaa, aaaa |
| ( ... ) | Groups patterns or captures the matched substring | (ab)+ | ab, abab, ababab |
| \| | OR operator: matches either the pattern on the left or the one on the right | cat\|dog | cat or dog |
| [abc] | Character class: matches any one character listed | [aeiou] | any vowel |
| [^abc] | Negated character class: matches any one character not listed | [^0-9] | any non-digit character |
| \ | Escape character: makes the following special character match literally | \. | a literal dot |
| . | Matches any single character except newline | a.b | acb, a1b, a-b |
| * | Matches zero or more repetitions of the preceding element | ca*t | ct, cat, caat |
| + | Matches one or more repetitions of the preceding element | go+gle | gogle, google, gooogle |
| ? | Matches zero or one repetition (makes the preceding element optional) | colou?r | color, colour |
| \b | Matches a word boundary | \bcat\b | cat as a whole word, not inside "category" |
| \d | Matches any digit (0-9) | \d+ | 7, 42, 2025 |
| \D | Matches any non-digit character | \D+ | abc |
| \w | Matches any alphanumeric character or underscore | \w+ | hello, user_1 |
| \W | Matches any non-word character | \W+ | punctuation or spaces such as ", " |
| \s | Matches any whitespace character (space, tab, newline) | \s+ | space, tab, newline |
| \S | Matches any non-whitespace character | \S+ | any run of non-whitespace, such as word1 |
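As a quick check of a few symbols from the table, the sketch below runs them through re.findall. The sample strings and variable names are made up for illustration only.

```python
import re

# Quantifier {m,n}: between 2 and 4 repetitions of "a" (greedy)
runs = re.findall(r"a{2,4}", "a aa aaaaa")

# Optional character with ?: both spellings match
spellings = re.findall(r"colou?r", "color and colour")

# Word boundary \b: "cat" only as a whole word
whole_words = re.findall(r"\bcat\b", "cat category concat")

# Alternation with |
pets = re.findall(r"cat|dog", "a cat and a dog")

print(runs)         # ['aa', 'aaaa']
print(spellings)    # ['color', 'colour']
print(whole_words)  # ['cat']
print(pets)         # ['cat', 'dog']
```

Note that a{2,4} takes four of the five consecutive a characters and leaves the last one unmatched, because a single a does not satisfy the minimum of two.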
List of Functions:
- re.match(pattern, string): checks for a match only at the beginning of the string and returns a match object, or None if there is no match.
import re
text = "123abc"
dataObj = re.match(r"\d+", text)
if dataObj:
    print(dataObj.group())  # Output: 123
    print(dataObj.start())  # Output: 0, start position of the matched string
    print(dataObj.end())    # Output: 3, end position of the matched string
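Because re.match only tries the pattern at position 0, it returns None when the digits appear later in the string. A minimal sketch of that behaviour, with illustrative variable names:

```python
import re

# re.match only tries the pattern at the start of the string
at_start = re.match(r"\d+", "123abc")
not_at_start = re.match(r"\d+", "abc123")

print(at_start.group())  # 123
print(not_at_start)      # None

# Guard against None before calling .group()
result = not_at_start.group() if not_at_start else "no match"
print(result)            # no match
```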
- re.search(pattern, string): scans the string and returns the first match found anywhere in it.
import re
text = "Order number: 12345"
dataObj = re.search(r"\d+", text)
print(dataObj.group())
# Output: 12345
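re.search also works with capturing groups, which the symbol table lists as ( ... ). The sketch below reuses the same sample string; the pattern and the name found are chosen here only for illustration.

```python
import re

text = "Order number: 12345"
found = re.search(r"(\w+) number: (\d+)", text)

if found:
    print(found.group(0))  # Order number: 12345 (the whole match)
    print(found.group(1))  # Order (first group)
    print(found.group(2))  # 12345 (second group)
    print(found.span())    # (0, 19), start and end of the whole match
```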
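- flags: optional arguments such as re.I (ignore case), re.M (multiline), and re.S (dot also matches newline) that change how a pattern is applied.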
import re
text = """Apple
banana
cherry
apple
banana
fruit\\nList
"""
print(re.findall(r"apple", text, re.I))    # ['Apple', 'apple'], case is ignored
print(re.findall(r"^b", text, re.M))       # ['b', 'b'], ^ matches at the start of each line
print(re.findall(r"apple.*", text, re.S))  # ['apple\nbanana\nfruit\\nList\n'], . also matches newlines
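Flags can be combined with the | operator. The following sketch, using a made-up sample string, compares the same pattern with and without re.M.

```python
import re

text = "Apple\nbanana\napple pie\n"

# re.I | re.M: ignore case, and let ^ anchor at the start of every line
starts = re.findall(r"^apple", text, re.I | re.M)
print(starts)      # ['Apple', 'apple']

# Without re.M, ^ anchors only at the very start of the string
first_only = re.findall(r"^apple", text, re.I)
print(first_only)  # ['Apple']
```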
- re.findall(pattern, string): returns a list of all matches in the string.
import re
text = "apple 123, banana 456, cherry 789"
numbers = re.findall(r"\d+", text)
print(numbers)
# Output: ['123', '456', '789']
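When the pattern contains more than one capturing group, re.findall returns a list of tuples instead of a list of strings. A short sketch on the same sample text, with pairs and lookup as illustrative names:

```python
import re

text = "apple 123, banana 456, cherry 789"

# Two groups: each result is a (word, number) tuple
pairs = re.findall(r"(\w+) (\d+)", text)
print(pairs)  # [('apple', '123'), ('banana', '456'), ('cherry', '789')]

# The tuples convert directly into a dictionary
lookup = dict(pairs)
print(lookup["banana"])  # 456
```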
- re.finditer(pattern, string): returns an iterator over all matches in the string.
import re
text = "A1 B22 C333"
for dataObj in re.finditer(r"\d+", text):
    print(dataObj.group(), "at position", dataObj.start(), "-", dataObj.end())
# Output:
# 1 at position 1 - 2
# 22 at position 4 - 6
# 333 at position 8 - 11
- re.sub(pattern, replacement, string): replaces all matches with the replacement string.
import re
text = "I have 2 apples and 3 bananas"
new_text = re.sub(r"\d+", "#", text)
print(new_text)
# Output: I have # apples and # bananas
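re.sub accepts a few more options than the basic form shows: a count argument, a function as the replacement, and backreferences such as \1 to captured groups. The sketch below illustrates each on the same sentence; the variable names are illustrative.

```python
import re

text = "I have 2 apples and 3 bananas"

# count=1 replaces only the first match
first = re.sub(r"\d+", "#", text, count=1)
print(first)    # I have # apples and 3 bananas

# A function as replacement receives each match object
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), text)
print(doubled)  # I have 4 apples and 6 bananas

# \1 and \2 refer back to the captured groups
swapped = re.sub(r"(\d+) (\w+)", r"\2 x\1", text)
print(swapped)  # I have apples x2 and bananas x3
```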
- re.split(pattern, string): splits the string into a list of substrings based on the pattern.
import re
text = "word1, word2; word3:word4"
parts = re.split(r"[,;:]\s*", text)
print(parts)
# Output: ['word1', 'word2', 'word3', 'word4']
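Two variations of re.split are worth knowing: maxsplit limits the number of splits, and a capturing group in the pattern keeps the separators in the result. A minimal sketch on the same input:

```python
import re

text = "word1, word2; word3:word4"

# maxsplit=1 stops after the first separator
limited = re.split(r"[,;:]\s*", text, maxsplit=1)
print(limited)    # ['word1', 'word2; word3:word4']

# A capturing group around the separator keeps it in the output
with_seps = re.split(r"([,;:])\s*", text)
print(with_seps)  # ['word1', ',', 'word2', ';', 'word3', ':', 'word4']
```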
- re.compile(pattern): creates a regex object you can reuse for multiple matches. The r"..." prefix marks a raw string, so Python leaves backslashes unprocessed and passes them to the regex engine as written.
import re
pattern = re.compile(r"\d+") # match one or more digits
result = pattern.findall("My number is 123 and pin is 4567")
print(result)
# Output: ['123', '4567']
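A compiled pattern exposes the same operations as the module-level functions (search, findall, sub and so on) and can carry flags. The sketch below reuses one compiled object across several strings; the sample data and names are made up for illustration.

```python
import re

# Compile once, with a flag, then reuse the same object
word = re.compile(r"\bpin\b", re.I)

lines = ["My PIN is 4567", "pineapple", "enter pin"]
hits = [line for line in lines if word.search(line)]
print(hits)    # ['My PIN is 4567', 'enter pin']

# Compiled patterns also provide sub()
digit = re.compile(r"\d")
masked = digit.sub("*", "pin is 4567")
print(masked)  # pin is ****
```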