Guided: Using Regular Expressions in Python
Unlock the power of regular expressions in Python with this hands-on lab designed to build your confidence and capability in text manipulation and data extraction. Whether you're parsing log files, validating user input, or cleaning messy datasets, mastering regex can streamline your workflow and enhance your code's precision. This lab walks you through the core syntax, including essential symbols like \d, \w, and \s, as well as anchors and quantifiers for targeted pattern matching. You'll gain practical experience using Python's match(), search(), and findall() functions to locate data, and learn how to validate common formats such as emails and dates. Finally, you'll manipulate text using powerful substitution and splitting techniques—essential tools for real-world tasks like analyzing CSV-style strings or cleaning logs. Perfect for developers, data analysts, and anyone looking to boost their pattern recognition skills, this lab makes regex approachable, applicable, and immediately useful.

-
Challenge
Introduction
Welcome to the Guided: Using Regular Expressions in Python Lab
In this lab, you will be provided with an environment and step-by-step instructions to help you:
- Create proper regular expressions with core regex syntax and patterns
- Perform text search and extraction with Python's `re` module
- Manipulate text through substitution and splitting with Python's `re` module
- Apply regex for real-world data validation and extraction
Prerequisites
You should have a basic understanding of Python, including how to write methods and instantiate variables. No prior experience with regular expressions is required.
Throughout the lab, you will run Python commands in the Terminal window to run your task implementations. All commands should be run from the `workspace` directory and will follow this structure:
`python3 regex_utils.py step<step_number> task<task_number> <text_to_perform_regex_on> <prefix_or_suffix_if_applicable>`
Tip: If you need assistance at any point, you can refer to the `/solution` directory. It contains subdirectories for each of the steps with example implementations.
-
Challenge
Text Search and Extraction
Regular Expression Syntax
A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. It is used for matching, locating, and managing text. Think of it as a super-powered search tool that can describe very complex text patterns.
You typically use regular expressions in:
- Searching for specific text within a document or a string.
- Validating data inputs (like checking if an email address is properly formatted).
- Replacing parts of text.
- Splitting text into parts based on patterns.
Patterns are written in a specific syntax made up of symbols, anchors, quantifiers, special characters, lookaheads, and lookbehinds.
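The four typical uses listed above can be sketched in a few lines. This is a minimal illustration, not part of the lab's `regex_utils.py`; the sample sentence and variable names are assumptions made up for the example.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Order 66 shipped to alice@example.com on 2024-05-17."

# 1. Searching: find the first run of digits anywhere in the string.
first_number = re.search(r"\d+", text).group()

# 2. Validating: fullmatch() succeeds only if the whole string fits the pattern.
looks_like_date = re.fullmatch(r"\d{4}-\d{2}-\d{2}", "2024-05-17") is not None

# 3. Replacing: mask every digit with "#".
masked = re.sub(r"\d", "#", text)

# 4. Splitting: break the sentence on runs of whitespace.
words = re.split(r"\s+", text)

print(first_number)     # 66
print(looks_like_date)  # True
print(masked)           # Order ## shipped to alice@example.com on ####-##-##.
print(words)
```

Each call uses a different `re` function, but all of them take the pattern first and the text to examine afterwards.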
Regular Expression Syntax Table Quick Reference
| Symbol | Category | Meaning / Description |
|--------|----------|-----------------------|
| `.` | Wildcard | Matches any single character except newline |
| `\d` | Character class | Matches any digit (`[0-9]`; in Python 3 also other Unicode decimal digits unless `re.ASCII` is set) |
| `\D` | Character class | Matches any non-digit |
| `\w` | Character class | Matches any word character (letters, digits, underscore) |
| `\W` | Character class | Matches any non-word character |
| `\s` | Character class | Matches any whitespace (space, tab, newline, and similar) |
| `\S` | Character class | Matches any non-whitespace character |
| `[...]` | Character set | Matches any one character inside brackets |
| `[^...]` | Negated set | Matches any one character not inside brackets |
| `\|` | Alternation | OR operator (a single pipe character); matches either the left or right pattern |
| `()` | Grouping | Groups expressions, enables capturing or combining parts |
| `(?:...)` | Non-capturing group | Groups pattern but doesn't capture it |
| `(?P<name>...)` | Named group | Captures a group with a name |
| `\b` | Anchor | Word boundary (between word and non-word character) |
| `\B` | Anchor | Not a word boundary |
| `^` | Anchor | Matches the start of the string (or line with multiline flag) |
| `$` | Anchor | Matches the end of the string or just before a trailing newline (or the end of each line with multiline flag) |
| `*` | Quantifier | Matches 0 or more repetitions |
| `+` | Quantifier | Matches 1 or more repetitions |
| `?` | Quantifier | Matches 0 or 1 (makes preceding token optional) |
| `{n}` | Quantifier | Matches exactly n repetitions |
| `{n,}` | Quantifier | Matches n or more repetitions |
| `{n,m}` | Quantifier | Matches between n and m repetitions |
| `?` after a quantifier | Lazy modifier | Makes quantifier non-greedy (match as little as possible), e.g. `*?` or `+?` |
| `(?=...)` | Lookahead (positive) | Match if followed by pattern (doesn't include it in result) |
| `(?!...)` | Lookahead (negative) | Match if not followed by pattern |
| `(?<=...)` | Lookbehind (positive) | Match if preceded by pattern |
| `(?<!...)` | Lookbehind (negative) | Match if not preceded by pattern |
| `\` | Escape | Escapes a special character (e.g., `\.` matches a literal dot) |
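To see how several rows of the table behave in practice, here is a short sketch that exercises character classes, quantifiers, anchors, alternation, lazy matching, lookarounds, and named groups. The sample strings and variable names are invented for illustration and are not taken from the lab tasks.

```python
import re

# Hypothetical sample text, invented for this illustration.
sample = "Item_42 costs $19.99; item_7 costs $5."

# Character classes and quantifiers: \w+ is one or more word characters.
words = re.findall(r"\w+", sample)

# Escaping and an optional non-capturing group: "$", digits, optional ".dd".
prices = re.findall(r"\$\d+(?:\.\d{2})?", sample)

# Anchors: ^ ties the match to the start, \b marks a word boundary.
starts_with_item = re.search(r"^Item", sample) is not None
whole_word_cats = re.findall(r"\bcat\b", "cat concat cats cat.")

# Alternation: match either alternative.
pets = re.findall(r"cat|dog", "cat dog bird")

# Greedy vs. lazy: .+ grabs as much as possible, .+? as little as possible.
greedy = re.findall(r"<.+>", "<b>bold</b>")
lazy = re.findall(r"<.+?>", "<b>bold</b>")

# Lookahead and lookbehind check context without including it in the match.
pixel_sizes = re.findall(r"\d+(?=px)", "10px 20em 30px")
dollar_amounts = re.findall(r"(?<=\$)\d+", sample)

# Named groups label the captured parts.
m = re.search(r"(?P<name>\w+) costs \$(?P<price>\d+(?:\.\d{2})?)", sample)

print(words)
print(prices)           # ['$19.99', '$5']
print(whole_word_cats)  # ['cat', 'cat']
print(greedy, lazy)     # ['<b>bold</b>'] ['<b>', '</b>']
print(pixel_sizes, dollar_amounts)
print(m.group("name"), m.group("price"))
```

Note that the lookbehind version returns only the whole-dollar digits, because `\d+` stops at the decimal point.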
Python's `re` Module
Python's `re` module contains several helpful methods used for completing text search, extraction, and manipulation with the help of regex.
`re` Module Methods Quick Reference
| Method | Purpose | Parameters | Returns | Notes |
|--------|---------|------------|---------|-------|
| `match()` | Match pattern at the start of string | `pattern`, `string`, `flags=0` | Match object or `None` | Good for "does this string start with..." |
| `search()` | Search anywhere in string | `pattern`, `string`, `flags=0` | Match object or `None` | Finds first occurrence |
| `fullmatch()` | Match the entire string | `pattern`, `string`, `flags=0` | Match object or `None` | Use for strict validation |
| `findall()` | Find all non-overlapping matches | `pattern`, `string`, `flags=0` | List of strings or tuples | Use `finditer()` for match objects instead |
| `finditer()` | Iterate over all matches as objects | `pattern`, `string`, `flags=0` | Iterator of match objects | Useful for position info, grouping, etc. |
| `sub()` | Replace pattern with replacement string | `pattern`, `repl`, `string`, `count=0`, `flags=0` | New string | Use for find-and-replace |
| `subn()` | Like `sub()`, but also returns count | `pattern`, `repl`, `string`, `count=0`, `flags=0` | Tuple: (string, count) | Great for auditing replacements |
| `split()` | Split string by pattern | `pattern`, `string`, `maxsplit=0`, `flags=0` | List of strings | Unlike `str.split()`, splits on a pattern rather than a fixed delimiter |
| `compile()` | Compile pattern for reuse | `pattern`, `flags=0` | Compiled pattern object | Useful when the same pattern is reused many times |
| `escape()` | Escape special regex chars in input | `string` | Escaped string | Use when inserting user input into regex safely |
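The differences between these functions are easiest to see side by side. The following sketch runs each one against the same made-up log-style string; the string and variable names are assumptions for illustration only.

```python
import re

# Hypothetical sample text, invented for this illustration.
line = "error 404 at 10:15, error 500 at 10:42"

# match() only looks at the start of the string; search() scans all of it.
at_start = re.match(r"\d+", line)    # None, because the line starts with "error"
anywhere = re.search(r"\d+", line)   # match object for "404"

# fullmatch() requires the pattern to cover the entire string.
is_time = re.fullmatch(r"\d{2}:\d{2}", "10:15") is not None

# findall() returns strings, or tuples when the pattern has several groups.
codes = re.findall(r"error (\d{3})", line)
pairs = re.findall(r"error (\d{3}) at (\d{2}:\d{2})", line)

# finditer() yields match objects, which also carry positions.
spans = [m.span() for m in re.finditer(r"\d{3}", line)]

# sub() replaces matches; subn() also reports how many replacements were made.
redacted = re.sub(r"\d{3}", "###", line)
redacted_again, count = re.subn(r"\d{3}", "###", line)

# split() accepts a pattern, so several delimiters can be handled at once.
fields = re.split(r"[,;]\s*", "a, b;c,   d")

# compile() builds a reusable pattern object that offers the same methods.
time_pattern = re.compile(r"\d{2}:\d{2}")
times = time_pattern.findall(line)

# escape() makes arbitrary text safe to embed in a pattern as a literal.
user_input = "1+1"
literal_hits = re.findall(re.escape(user_input), "1+1=2 but 11 is not 1+1")

print(at_start, anywhere.group())  # None 404
print(codes, pairs)
print(spans)                       # [(6, 9), (26, 29)]
print(redacted, count)
print(fields, times, literal_hits)
```

Without `re.escape()`, the `+` in `"1+1"` would act as a quantifier and the pattern would match `"11"` instead of the literal text.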
Text Search and Extraction with Regular Expressions
In the upcoming tasks, you will have the opportunity to use Python's `re` module to search for pieces of text that match what you are looking for. This will require writing regular expressions with core regular expression syntax such as symbols, anchors, and quantifiers.
Tip: In Python, regex patterns are usually written as raw strings by prefixing them with `r`, like `r"\d+"`, so that backslashes are treated correctly.
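The raw-string tip matters most for sequences that Python itself interprets, such as `\b`. The sketch below, using a made-up sample string, shows how the same pattern behaves with and without the `r` prefix.

```python
import re

# In a normal string literal, "\b" is the backspace character (one character).
# In a raw string, r"\b" is a backslash followed by "b" (two characters),
# which the regex engine reads as a word boundary.
plain_length = len("\b")
raw_length = len(r"\b")

# Hypothetical sample text, invented for this illustration.
text = "cat concat cats"

raw_hits = re.findall(r"\bcat\b", text)       # word boundaries apply
plain_hits = re.findall("\bcat\b", text)      # searches for backspace characters
doubled_hits = re.findall("\\bcat\\b", text)  # works, but is harder to read

print(plain_length, raw_length)  # 1 2
print(raw_hits)                  # ['cat']
print(plain_hits)                # []
print(doubled_hits)              # ['cat']
```

Without the `r` prefix, every backslash meant for the regex engine has to be doubled, which is why raw strings are the usual choice for patterns.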
-
Challenge
Text Manipulation
Match Objects
A match object is a special object returned by some `re` module methods. It provides useful information about the match, such as the matched text, its position in the original string, and any captured groups.
Match Object Methods & Properties Quick Reference
| Property / Method | Description |
|-------------------|-------------|
| `.group()` | Returns the entire match (or a specific group if passed an index or name) |
| `.groups()` | Returns a tuple of all captured groups (named groups included) |
| `.groupdict()` | Returns a dictionary of all named capturing groups |
| `.start()` | Returns the start index of the match |
| `.end()` | Returns the end index (1 past the last character) of the match |
| `.span()` | Returns a tuple `(start, end)` representing the range of the match |
| `.pos` | The starting position of the search within the string |
| `.endpos` | The ending position (limit) of the search |
| `.re` | The regular expression object used for the match |
| `.string` | The original string passed to `re.search()` or similar |
| `.lastgroup` | The name of the last matched capturing group (`None` if it has no name) |
| `.lastindex` | The index of the last matched capturing group (by number) |

Text Manipulation with Regular Expressions
In the upcoming tasks, you will use Python's `re` module to manipulate text by substituting specific patterns with new text and splitting text based on defined patterns. You may also work with match objects to extract additional information about matches.
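As a rough sketch of what substitution, splitting, and match-object inspection can look like, consider the example below. The log string, the helper name `to_lower`, and the patterns are assumptions invented for illustration; they are not the lab's task solutions.

```python
import re

# Hypothetical sample text, invented for this illustration.
log = "2024-05-17 ERROR disk full; 2024-05-18 WARN  cpu high"

# Substitution with backreferences: reorder YYYY-MM-DD into DD/MM/YYYY.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", log)

# Substitution with a function: the callable receives each match object.
def to_lower(match):
    return match.group().lower()

lowered = re.sub(r"\b[A-Z]{4,5}\b", to_lower, log)

# count limits how many replacements are made.
first_only = re.sub(r"\d{4}-\d{2}-\d{2}", "<date>", log, count=1)

# Splitting on a pattern: a semicolon followed by optional whitespace.
entries = re.split(r";\s*", log)

# maxsplit keeps the remainder of the string together.
head = re.split(r"\s+", entries[0], maxsplit=2)

# Match objects expose the matched text, its groups, and its position.
m = re.search(r"(?P<level>ERROR|WARN)\s+(?P<message>[^;]+)", log)

print(reordered)
print(lowered)
print(first_only)
print(entries)
print(head)                # ['2024-05-17', 'ERROR', 'disk full']
print(m.group(), m.span()) # ERROR disk full (11, 26)
print(m.groupdict())
```

Passing a function as the replacement is useful when the new text depends on what was matched, which a fixed replacement string cannot express.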
-
Challenge
Real World Examples
Real World Examples
In this step, you will apply what you've learned about core regex syntax, Python's `re` module, and match objects to solve real-world problems using regular expressions.
Real-world regex skills are powerful for validating, extracting, and cleaning data across countless applications!
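A minimal sketch of such real-world uses might look like the following. The patterns are deliberately simple, and the function names, log format, and sample inputs are assumptions made for this illustration rather than the lab's reference solutions.

```python
import re

# Deliberately simple patterns for illustration; production-grade email and
# date validation usually needs stricter rules.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
DATE = re.compile(
    r"(?P<year>\d{4})-(?P<month>0[1-9]|1[0-2])-(?P<day>0[1-9]|[12]\d|3[01])"
)
# Assumed log format: "[HH:MM:SS] LEVEL: message"
LOG_LINE = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2})\]\s+(?P<level>[A-Z]+):\s+(?P<message>.*)"
)

def is_email(text):
    """Return True if the whole string looks like an email address."""
    return EMAIL.fullmatch(text) is not None

def parse_date(text):
    """Return year, month, and day as integers, or None if the shape is wrong."""
    m = DATE.fullmatch(text)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}

def parse_log_line(line):
    """Extract time, level, and message from a line in the assumed format."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None

def split_csv_row(row):
    """Split a simple comma-separated row and drop whitespace around fields."""
    return re.split(r"\s*,\s*", row.strip())

print(is_email("alice@example.com"))  # True
print(is_email("alice@example"))      # False
print(parse_date("2024-05-17"))       # {'year': 2024, 'month': 5, 'day': 17}
print(parse_date("2024-13-01"))       # None
print(parse_log_line("[10:15:02] ERROR: disk full"))
print(split_csv_row(" a , b,c ,d "))  # ['a', 'b', 'c', 'd']
```

The date pattern checks only the shape of the value, so an impossible date such as `2024-02-31` still passes. The row splitter does not handle quoted fields containing commas, for which Python's `csv` module is the better tool.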