Python Code Review Checklist: Pythonic Code and Best Practices

Written by Christoffer Artmann

Mutable default arguments cause bizarre bugs where function calls affect each other. Catching broad exceptions hides real errors. Reference semantics create unexpected mutations when copying lists. Bare except blocks silently swallow exceptions and make debugging impossible. Such code runs without errors until specific conditions reveal the problems.

This checklist helps catch what matters: mutable default bugs, exception handling mistakes, reference vs copy errors, and code that fights Python idioms instead of embracing Pythonic patterns.

Quick Reference Checklist

Use this checklist as a quick reference during code review. Each item links to detailed explanations below.

Pythonic Patterns

  • List comprehensions for simple transformations: [x * 2 for x in numbers]
  • Generator expressions for large sequences: (x * 2 for x in numbers)
  • Dictionary comprehensions: {k: v for k, v in items}
  • enumerate() for index and value: for i, value in enumerate(items)
  • zip() for parallel iteration: for x, y in zip(list1, list2)
  • Context managers (with) for resources: with open(file) as f:
  • Decorators for cross-cutting concerns: @property, @staticmethod
  • __str__ and __repr__ defined for custom classes

Type Hints & Annotations

  • Function signatures include type hints: def greet(name: str) -> str:
  • Complex types use the typing module: List[str], Dict[str, int]
  • Optional types explicit: Optional[str] instead of implicit None
  • Union types for alternatives: Union[int, str]
  • Generic types parameterized: List[User] not just List
  • Return types documented even for None: -> None
  • mypy passes without errors (when using type hints)
  • Protocol types for structural typing (Python 3.8+)

Error Handling

  • Specific exceptions caught, not bare except:
  • Exception messages provide context: what failed and why
  • try blocks minimal (only code that might fail)
  • Resources cleaned up in finally or with context managers
  • Custom exceptions inherit from appropriate base
  • EAFP preferred over LBYL: try/except over if checks
  • Exceptions not used for control flow
  • raise from preserves exception chain: raise NewError() from e

Resource Management

  • Files opened with context manager: with open() as f:
  • Database connections use context managers
  • No manual close() calls (context manager handles it)
  • __enter__ and __exit__ implemented for custom context managers
  • Thread locks acquired with with lock:
  • Network connections properly closed
  • Temporary files use tempfile module
  • Large files processed in chunks, not loaded entirely

Collections & Data Structures

  • Default dict values use dict.get(key, default) or defaultdict
  • List methods used appropriately: append() vs extend()
  • Set operations for membership and uniqueness
  • Tuples for immutable sequences
  • Named tuples or dataclasses for structured data
  • deque for queues, not lists
  • Counter for counting, not manual dict manipulation
  • Shallow vs deep copy understood: list.copy() vs copy.deepcopy()

Standard Library Usage

  • pathlib.Path for file paths, not string manipulation
  • datetime for date/time, not manual parsing
  • json module for JSON, not manual string building
  • re module for regex, compiled for repeated use
  • itertools for advanced iteration patterns
  • functools for functional programming helpers
  • collections for specialized containers
  • logging for logs, not print() statements

Pythonic Patterns and Idioms

List comprehensions express transformations concisely. Instead of building lists with loops and append, [x * 2 for x in numbers] creates a new list directly. This reads as "double each number," making intent clear. List comprehensions work for any iterable and can include conditions: [x for x in numbers if x > 0] filters and transforms in one expression. When we see explicit loops building lists, comprehensions often improve clarity.
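A minimal sketch of the loop-and-append version next to its comprehension equivalents:

```python
numbers = [3, -1, 4, -1, 5]

# Explicit loop building a list with append
doubled_loop = []
for x in numbers:
    doubled_loop.append(x * 2)

# Equivalent comprehension: reads as "double each number"
doubled = [x * 2 for x in numbers]

# Filter and transform in one expression
positives_doubled = [x * 2 for x in numbers if x > 0]
```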

Generator expressions provide memory-efficient iteration for large sequences. The syntax (x * 2 for x in numbers) creates an iterator that produces values on demand rather than building a complete list. This matters for large datasets or infinite sequences. When a list is only iterated once, generator expressions use less memory. The distinction between [] for lists and () for generators is subtle but important.
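A quick sketch of the difference in practice: the generator below represents a million values but never materializes them.

```python
# Generator expression: values produced lazily, one at a time
squares = (x * x for x in range(1_000_000))

# Consuming functions like sum() pull values without building a list
total = sum(x * x for x in range(100))

# Values come out on demand; the full sequence never exists in memory
first_three = [next(squares) for _ in range(3)]
```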

Enumerate provides both index and value when looping. Instead of for i in range(len(items)): item = items[i], we write for i, item in enumerate(items). This eliminates manual indexing and makes the loop's intent clear. When we see range(len()) in loops, enumerate improves the code.
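The same idea in code, alongside zip() for parallel iteration:

```python
items = ["a", "b", "c"]
prices = [1.0, 2.5, 3.0]

# enumerate replaces manual indexing with range(len(items))
indexed = []
for i, item in enumerate(items):
    indexed.append((i, item))

# start= shifts the counter, e.g. for 1-based display
numbered = list(enumerate(items, start=1))

# zip pairs parallel sequences without indexing either one
paired = [(name, price) for name, price in zip(items, prices)]
```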

Context managers using with ensure cleanup happens. The pattern with open(filename) as f: guarantees file closure even if exceptions occur. This beats manual try/finally blocks. Context managers work for any resource: locks, database connections, network sockets. When we see manual resource management, context managers simplify the code.

Decorators add functionality to functions without modifying them. @property makes methods accessible as attributes. @staticmethod marks methods that don't need instance access. @lru_cache adds memoization. When cross-cutting concerns like logging, timing, or access control appear, decorators provide clean separation.
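A sketch of the pattern: logged below is a hypothetical decorator that counts calls, shown next to the standard library's lru_cache.

```python
import functools

# Hypothetical decorator: wraps a function, adds behavior,
# and preserves the original metadata via functools.wraps
def logged(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        wrapper.calls += 1
        return result
    wrapper.calls = 0
    return wrapper

@logged
def add(a, b):
    return a + b

# lru_cache from the standard library memoizes pure functions
@functools.lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Without functools.wraps, the decorated function would report its name as wrapper, which breaks introspection and debugging.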

Type Hints and Static Typing

Type hints document function contracts without affecting runtime behavior. The signature def greet(name: str) -> str: specifies that greet accepts a string and returns a string. This helps readers understand intent and enables static checking with mypy. Type hints prove most valuable in public APIs, complex functions, and code that interacts with external systems.

The typing module provides types for complex structures. List[str] specifies a list of strings, not just any list. Dict[str, int] documents both key and value types. Generic types make collections type-safe. When we see unparameterized generic types like List or Dict, specifying content types improves documentation.

Optional represents values that might be None. Optional[str] is equivalent to Union[str, None] but reads more clearly. This makes None explicit in the type system rather than assuming all types might be None. When functions might return None, Optional documents this in the signature rather than just the docstring.
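A small sketch combining the hints above; find_age and greet_all are hypothetical names for illustration:

```python
from typing import Dict, List, Optional

def find_age(ages: Dict[str, int], name: str) -> Optional[int]:
    # .get returns None for missing keys, matching the Optional return type
    return ages.get(name)

def greet_all(names: List[str]) -> List[str]:
    return [f"Hello, {name}" for name in names]
```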

Union types specify alternatives. Union[int, str] means a value could be either int or str. This documents actual usage when values legitimately have multiple types. When reviewing Union types, we verify the alternatives make sense and aren't masking design problems.

Mypy checks type consistency across codebases. When projects use type hints, running mypy catches type mismatches that would cause runtime errors. Gradual typing allows adding hints incrementally without forcing full coverage. When we see type hints in code, mypy integration ensures they're actually enforced.

Exception Handling Philosophy

Specific exceptions communicate what went wrong. Catching except ValueError: handles value errors specifically. Catching bare except: catches everything including KeyboardInterrupt and SystemExit, which usually isn't intended. When we see bare except blocks, we verify they're necessary or suggest specific exception types.

Exception messages should explain what failed and provide context. The message "Failed to load config" is better than "Error" but "Failed to load config from /etc/app.conf: file not found" provides actionable information. When we see exceptions with generic messages, more specific messages improve debugging.

Try blocks should contain minimal code—only what might raise the exception we're catching. Large try blocks catch exceptions from unexpected places, masking bugs. When reviewing exception handling, we verify try blocks contain only the code that might throw the specific exceptions caught.

EAFP (Easier to Ask Forgiveness than Permission) is Pythonic. Instead of checking if a key exists before accessing it, we try accessing and catch KeyError. The pattern try: value = d[key] except KeyError: avoids a second lookup in the common success case and is safe against race conditions between the check and the access. When we see excessive existence checks, EAFP might simplify the code.
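Both styles side by side, as a minimal sketch:

```python
settings = {"host": "localhost"}

# LBYL: look before you leap (a separate check, then the access)
if "host" in settings:
    host = settings["host"]
else:
    host = "unknown"

# EAFP: attempt the lookup and handle the failure
try:
    port = settings["port"]
except KeyError:
    port = 8080  # fall back to a default
```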

Custom exceptions provide domain-specific error handling. Inheriting from appropriate base classes (ValueError for value errors, TypeError for type errors) integrates custom exceptions with Python's exception hierarchy. When business rules fail, custom exceptions communicate the specific failure better than generic exceptions.

raise from preserves exception context when raising new exceptions. The pattern raise ConfigError("Invalid format") from e keeps the original exception in the chain, making debugging easier. When we see raised exceptions that lose context, raise from preserves valuable information.
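A sketch of exception chaining; ConfigError and parse_port are hypothetical names for illustration:

```python
class ConfigError(Exception):
    pass

def parse_port(raw: str) -> int:
    try:
        return int(raw)
    except ValueError as e:
        # Chain the original ValueError so tracebacks show both errors
        raise ConfigError(f"Invalid port value: {raw!r}") from e

try:
    parse_port("not-a-number")
except ConfigError as err:
    original = err.__cause__  # the underlying ValueError is preserved
```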

Resource and Memory Management

Files must be opened with context managers. The pattern with open(filename) as f: ensures closure even when exceptions occur. Manual close() calls in finally blocks are error-prone and verbose. When we see file operations without context managers, that's a resource leak waiting to happen.

Database connections, network sockets, and thread locks all benefit from context managers. Any resource requiring cleanup should use with. The pattern applies to standard library resources and custom resources implementing __enter__ and __exit__.

Custom context managers handle resource lifecycles cleanly. The methods __enter__ and __exit__ define setup and teardown. When we see classes managing resources without context manager support, implementing these methods improves the API.
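A minimal sketch of a custom context manager; Resource is a hypothetical class tracking its own acquire/release state:

```python
class Resource:
    def __init__(self):
        self.open = False

    def __enter__(self):
        self.open = True
        return self  # bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.open = False  # cleanup runs even if the with body raised
        return False       # do not suppress exceptions

res = Resource()
with res:
    was_open = res.open
```

For simple cases, contextlib.contextmanager turns a generator function into a context manager without writing a class.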

Large files processed in chunks prevent memory problems. Reading entire files with read() fails when files exceed available memory. Iterating with for line in file: processes one line at a time. The iter() function and generators enable chunk processing for any large dataset.
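A sketch of both techniques, using in-memory streams so the example is self-contained; real file objects behave the same way:

```python
import io

# Line-by-line iteration never holds the whole file in memory
stream = io.StringIO("line one\nline two\nline three\n")
lines = [line.rstrip("\n") for line in stream]

# Fixed-size chunks via iter(callable, sentinel): read(4) is called
# repeatedly until it returns the empty-bytes sentinel
stream2 = io.BytesIO(b"abcdefghij")
chunks = list(iter(lambda: stream2.read(4), b""))
```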

Shallow vs deep copy determines whether nested objects are shared. list.copy() or dict.copy() creates a new container but shares elements. copy.deepcopy() recursively copies everything. When reviewing code that copies data structures, we verify the chosen copy method matches the intent. Unexpected sharing of mutable nested objects causes hard-to-find bugs.
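The sharing bug in miniature:

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested.copy()        # new outer list, shared inner lists
deep = copy.deepcopy(nested)   # fully independent copy

nested[0].append(99)
# The shallow copy sees the mutation; the deep copy does not
```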

Collections and Data Structures

Default dictionary values use get() or defaultdict. Instead of if key in dict: value = dict[key] else: value = default, we write value = dict.get(key, default). For complex default logic, defaultdict(factory) automatically creates missing values. When we see manual existence checks before dictionary access, these patterns simplify the code.
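Both patterns in a short sketch:

```python
from collections import defaultdict

scores = {"alice": 10}

# dict.get with a default replaces the explicit membership check
bob_score = scores.get("bob", 0)

# defaultdict creates missing values on first access
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)  # no "if key not in groups" needed
```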

List methods serve different purposes. append() adds one element, extend() adds multiple. The distinction matters when adding sequences: list.extend([1, 2]) adds two elements while list.append([1, 2]) adds one element (the list itself). When we see append in loops adding multiple items, extend or list concatenation might be clearer.

Sets provide O(1) membership testing and automatic uniqueness. When code checks if values exist in lists repeatedly, converting to set improves performance. Set operations like union, intersection, and difference express intent clearly when working with groups of items.

Named tuples or dataclasses structure related values. Instead of tuples with positional access, Point = namedtuple('Point', ['x', 'y']) or @dataclass class Point: creates types with named fields. This improves readability and catches errors when field positions change.
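A sketch of both options; Point and Circle are illustrative names:

```python
from collections import namedtuple
from dataclasses import dataclass

# Named fields instead of positional tuple indexing
Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)

# dataclass generates __init__, __repr__, and __eq__
@dataclass
class Circle:
    center: Point
    radius: float

c = Circle(center=p, radius=2.5)
```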

deque provides O(1) operations at both ends, unlike lists where insertion or removal at the start requires shifting all elements. When implementing queues or needing efficient operations at both ends, deque outperforms lists. When we see lists used as queues with pop(0), deque improves performance.
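A queue in a few lines:

```python
from collections import deque

# list.pop(0) shifts every remaining element; deque.popleft() is O(1)
queue = deque(["first", "second", "third"])
queue.append("fourth")    # enqueue at the right
served = queue.popleft()  # dequeue from the left
```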

Counter from collections counts occurrences. Instead of manual dictionary building, Counter(items) produces counts directly. Methods like most_common() provide common operations. When we see loops counting items into dictionaries, Counter simplifies the code.
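The counting loop collapses to one call:

```python
from collections import Counter

words = ["red", "blue", "red", "green", "red", "blue"]

counts = Counter(words)        # replaces a manual dict-building loop
top = counts.most_common(1)    # [(item, count)] pairs, highest first
```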

Standard Library Capabilities

Pathlib provides object-oriented path manipulation. Instead of string operations and os.path functions, Path(filename).read_text() reads files, path.glob('*.txt') finds files, path.parent accesses directories. When we see string concatenation for paths or extensive os.path usage, pathlib improves readability.
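A small sketch of the object-oriented style, using PurePosixPath so no filesystem access is needed and the results are deterministic:

```python
from pathlib import PurePosixPath

# Pure paths manipulate path strings without touching the filesystem
path = PurePosixPath("/etc/app/config.yaml")

name = path.name                # final component
suffix = path.suffix            # file extension
parent = str(path.parent)       # containing directory
joined = str(path.parent / "local.yaml")  # / replaces string concatenation
```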

Datetime handles temporal operations correctly including timezones. Manual date parsing with string operations creates bugs around edge cases. The datetime module handles these correctly. When we see string manipulation for dates, datetime provides tested solutions.

JSON module handles serialization safely. Building JSON with string concatenation creates injection vulnerabilities and encoding problems. json.dumps() handles encoding correctly. When we see manual JSON string building, the json module is safer.

Regular expressions compiled once and reused perform better than recreating patterns on every use. pattern = re.compile(r'...') followed by pattern.match() avoids recompiling. When regex operations happen in loops, compilation should happen once outside the loop.
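Compile-once in practice; WORD_DIGITS is a hypothetical pattern name:

```python
import re

# Compiled once at module level, reused across many calls
WORD_DIGITS = re.compile(r"(\w+)-(\d+)")

matches = []
for text in ["item-1", "item-2", "skip", "item-3"]:
    m = WORD_DIGITS.match(text)  # no recompilation inside the loop
    if m:
        matches.append(int(m.group(2)))
```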

Itertools provides advanced iteration patterns like combinations, permutations, and grouping. Manual implementations of these patterns are error-prone and verbose. When complex iteration appears, itertools often has a solution.

Functools provides functional programming helpers. lru_cache memoizes function results. partial creates new functions with some arguments pre-filled. These patterns eliminate manual memoization and adapter functions.

Logging module provides structured logging with levels and configuration. Print statements in production code can't be filtered by level, redirected, or silenced, and don't integrate with logging infrastructure. When we see print() in non-script code, logging provides better control.

Bringing It Together

Effective Python code review balances Pythonic idioms with practical concerns. Python's philosophy emphasizes readability and explicitness. Code that fights Python's features becomes harder to maintain. When reviewing Python, we verify the code uses appropriate language features and standard library capabilities.

Not all issues carry equal weight. Mutable default arguments cause real bugs and should be fixed. A loop that could be a comprehension might not need changing if the existing code is clear. The key is distinguishing issues affecting correctness from those representing style preferences.

Python continues evolving with yearly releases adding features like structural pattern matching, the walrus operator, and improved type hints. Teams benefit from adopting features that solve real problems. Code review creates opportunities to share knowledge about Python's capabilities and discuss when modern features improve code clarity and correctness.