TechLead
Lesson 6 of 25
Python

String Manipulation

Master Python string methods, formatting, regular expressions, and text processing techniques

String Methods

Python strings come with a rich set of built-in methods for searching, transforming, splitting, and joining text. Because strings are immutable, methods that transform text, such as upper() or replace(), return a new string and leave the original unchanged. Together these methods cover most everyday text-processing tasks.

# Case methods
text = "Hello, World!"
print(text.upper())       # "HELLO, WORLD!"
print(text.lower())       # "hello, world!"
print(text.title())       # "Hello, World!"
print(text.capitalize())  # "Hello, world!"
print(text.swapcase())    # "hELLO, wORLD!"
print("hello".islower())  # True
print("HELLO".isupper())  # True

# Search methods
s = "Python is great and Python is fun"
print(s.find("Python"))      # 0 (first occurrence)
print(s.rfind("Python"))     # 20 (last occurrence)
print(s.find("Java"))        # -1 (not found)
print(s.index("Python"))     # 0 (like find but raises ValueError if not found)
print(s.count("Python"))     # 2
print(s.startswith("Python"))  # True
print(s.endswith("fun"))       # True

# Strip methods (remove whitespace or specified characters)
padded = "   Hello, World!   "
print(padded.strip())      # "Hello, World!"
print(padded.lstrip())     # "Hello, World!   "
print(padded.rstrip())     # "   Hello, World!"
print("###Hello###".strip("#"))  # "Hello"

# Replace
text = "Hello, World!"
print(text.replace("World", "Python"))  # "Hello, Python!"
print(text.replace("l", "L", 2))        # "HeLLo, World!" (max 2 replacements)
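
Two points from this section are easy to get wrong in practice, so here is a minimal sketch of both. It assumes Python 3.9+ for removesuffix(), and the sample values are made up for illustration. It shows that chained methods leave the original string untouched, and that strip() and its variants treat their argument as a set of characters rather than as a literal prefix or suffix.

```python
# Immutability: chaining methods builds new strings; the original is unchanged.
original = "  Hello, World!  "
cleaned = original.strip().lower()
print(repr(original))  # still padded and capitalized
print(repr(cleaned))

# rstrip(".txt") removes any trailing '.', 't', or 'x' characters,
# so it also eats the final 't' of "test".
surprising = "test.txt".rstrip(".txt")
print(surprising)

# removesuffix() (Python 3.9+) removes the exact suffix, and only once.
expected = "test.txt".removesuffix(".txt")
print(expected)
```

When the goal is to drop a known prefix or suffix, removeprefix() and removesuffix() are usually the safer choice; strip() is better kept for trimming whitespace or padding characters.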

Splitting and Joining

The split() and join() methods are among the most frequently used string operations. They convert between strings and lists, which is essential for parsing data, building output, and processing text.

# split() - string to list
csv_line = "Alice,30,Engineer,NYC"
fields = csv_line.split(",")
print(fields)  # ['Alice', '30', 'Engineer', 'NYC']

text = "Hello   World"
print(text.split())       # ['Hello', 'World'] (splits on any whitespace)
print(text.split(" "))    # ['Hello', '', '', 'World'] (splits at every single space)

# split with maxsplit
log = "ERROR: 2026-01-15: Something went wrong: details here"
parts = log.split(": ", maxsplit=2)
print(parts)  # ['ERROR', '2026-01-15', 'Something went wrong: details here']

# splitlines()
multiline = "Line 1\nLine 2\nLine 3"
lines = multiline.splitlines()
print(lines)  # ['Line 1', 'Line 2', 'Line 3']

# join() - list to string
words = ["Python", "is", "awesome"]
sentence = " ".join(words)
print(sentence)  # "Python is awesome"

# Join with different separators
path = "/".join(["home", "user", "documents"])
csv = ",".join(["Alice", "30", "NYC"])
html = "<br>".join(["Line 1", "Line 2", "Line 3"])

# Practical: clean and normalize text
raw = "  Hello,   World!   This  is   Python  "
cleaned = " ".join(raw.split())
print(cleaned)  # "Hello, World! This is Python"

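As a rough illustration of how split() and join() work together when parsing data, the sketch below turns a line of key=value pairs into a dictionary and back. The parse_pairs helper, its parameters, and the sample line are hypothetical, not part of any library. It uses str.partition() for the key/value step because partition() does not raise when the separator is missing.

```python
def parse_pairs(line, pair_sep=";", kv_sep="="):
    """Parse 'key=value' pairs into a dict (hypothetical helper)."""
    result = {}
    for chunk in line.split(pair_sep):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip empty pieces, e.g. after a trailing separator
        # partition() splits at the first kv_sep only
        key, _, value = chunk.partition(kv_sep)
        result[key.strip()] = value.strip()
    return result


settings = parse_pairs("host = localhost; port = 8080; debug = true;")
print(settings)

# join() goes the other way: build one string from many pieces.
rebuilt = "; ".join(f"{key}={value}" for key, value in settings.items())
print(rebuilt)
```

All values come back as strings, so a real parser would still need to convert "8080" or "true" to the types it expects.
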
String Formatting

Python offers several ways to format strings. f-strings (Python 3.6+) are the recommended approach for most cases due to their readability, performance, and expressiveness.

# f-strings (recommended)
name = "Alice"
age = 30
print(f"Hello, {name}! You are {age} years old.")
print(f"In 5 years you'll be {age + 5}.")
print(f"Your name in uppercase: {name.upper()}")

# f-string format specifiers
pi = 3.14159265
print(f"Pi: {pi:.2f}")           # "Pi: 3.14"
print(f"Pi: {pi:10.2f}")         # "Pi:       3.14" (width 10)
print(f"Percentage: {0.856:.1%}") # "Percentage: 85.6%"

big_number = 1234567890
print(f"Number: {big_number:,}")  # "Number: 1,234,567,890"
print(f"Number: {big_number:_}")  # "Number: 1_234_567_890"

# Padding and alignment
text = "Hello"
print(f"{text:<20}")   # "Hello               " (left)
print(f"{text:>20}")   # "               Hello" (right)
print(f"{text:^20}")   # "       Hello        " (center)
print(f"{text:*^20}")  # "*******Hello********" (fill char)

# Debug format (Python 3.8+)
x = 42
print(f"{x = }")       # "x = 42"
print(f"{x * 2 = }")   # "x * 2 = 84"

# str.format() method (older approach)
print("Hello, {}! You are {} years old.".format(name, age))
print("Hello, {name}! You are {age} years old.".format(name=name, age=age))
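
The alignment and precision specifiers above combine naturally when printing tabular output. The following is a small sketch with invented sample rows. It computes the first column's width from the data and passes it into the format spec as a nested replacement field.

```python
# Hypothetical sample data: (name, quantity, unit price)
rows = [("Widget", 4, 3.5), ("Gadget", 12, 12.25), ("Thingamajig", 1, 199.0)]

# Width of the name column is the longest name in the data.
name_width = max(len(name) for name, _, _ in rows)

# Header row: left-align text, right-align numeric columns.
lines = [f"{'Item':<{name_width}} {'Qty':>3} {'Price':>8}"]
for name, qty, price in rows:
    # {name_width} inside the spec is filled in before formatting is applied.
    lines.append(f"{name:<{name_width}} {qty:>3d} {price:>8.2f}")

report = "\n".join(lines)
print(report)
```

Because every row uses the same widths, the columns line up in any monospaced output.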

Regular Expressions

The re module provides powerful pattern matching and text manipulation using regular expressions. While string methods handle simple cases, regex is indispensable for complex pattern matching, validation, and text extraction.

import re

# Basic matching
text = "My phone number is 555-123-4567"
match = re.search(r'\d{3}-\d{3}-\d{4}', text)
if match:
    print(match.group())  # "555-123-4567"

# findall - find all matches
text = "Emails: alice@example.com and bob@test.org"
emails = re.findall(r'[\w.]+@[\w.]+', text)
print(emails)  # ['alice@example.com', 'bob@test.org']

# sub - replace patterns
text = "Call 555-123-4567 or 555-987-6543"
redacted = re.sub(r'\d{3}-\d{3}-\d{4}', 'XXX-XXX-XXXX', text)
print(redacted)  # "Call XXX-XXX-XXXX or XXX-XXX-XXXX"

# Groups - extract parts of a match
pattern = r'(\w+)@(\w+)\.(\w+)'
match = re.search(pattern, "alice@example.com")
if match:
    print(match.group(0))  # "alice@example.com"
    print(match.group(1))  # "alice"
    print(match.group(2))  # "example"

# Named groups
pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
match = re.search(pattern, "Date: 2026-01-15")
if match:
    print(match.group("year"))   # "2026"
    print(match.group("month"))  # "01"

# Compile once for reuse (gives the pattern a name and skips the cache lookup on each call)
email_pattern = re.compile(r'^[\w.+-]+@[\w-]+\.[\w.]+$')
print(email_pattern.match("alice@example.com") is not None)  # True
print(email_pattern.match("not-an-email") is not None)       # False
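
To show how compiled patterns and named groups fit together, here is a minimal sketch that extracts error entries from a few log lines. The log format, the LOG_LINE pattern, and the sample text are assumptions made for illustration. It uses fullmatch(), which succeeds only if the whole line fits the pattern.

```python
import re

# Assumed line format: "<YYYY-MM-DD> <LEVEL>: <message>"
LOG_LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+): (?P<message>.+)"
)

log_text = """\
2026-01-15 ERROR: Disk almost full
2026-01-15 INFO: Backup finished
not a log line
2026-01-16 ERROR: Connection lost
"""

errors = []
for line in log_text.splitlines():
    match = LOG_LINE.fullmatch(line)
    if match is None:
        continue  # ignore lines that do not fit the assumed format
    if match.group("level") == "ERROR":
        errors.append((match.group("date"), match.group("message")))

print(errors)
```

Named groups keep the extraction code readable, since match.group("date") still works if the order of the groups in the pattern changes later.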

Key Takeaways

  • f-strings are preferred: Use them for most string formatting needs (Python 3.6+)
  • split() and join(): Master these for text parsing and construction
  • Regex for complex patterns: Use re module when string methods are not enough
  • Strings are immutable: Transforming methods return new strings and never modify the original
