Talks - Irit Katriel: CPython's Compilation Pipeline
Learn how CPython 3.13's compilation pipeline evolves with new stages between AST and bytecode, making the compiler more modular, testable, and maintainable. A deep dive by Irit Katriel.
-
CPython 3.13's compilation pipeline introduces new stages between the AST and bytecode generation, improving modularity and testability
-
The main compilation stages are:
- Tokenizer (converts source code to tokens)
- Parser (builds AST)
- AST Optimizer
- Code Generation (produces instruction sequence)
- Peephole Optimizer (optimizes the pseudo-instruction sequence)
- Assembler (creates final bytecode)
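The stages above can be observed from Python using only public standard-library modules. This is a rough sketch, not a view into the compiler's internals: the tokenize module is a separate Python-level interface to tokenization, and compile() runs code generation, optimization, and assembly in one step, so the intermediate instruction sequence is not visible here. The source string and the "<example>" filename are made up for illustration.

```python
import ast
import dis
import io
import tokenize

source = "x = 1 + 2\n"

# Stage 1: tokenizer (source text -> tokens)
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Stage 2: parser (source text -> AST)
tree = ast.parse(source)

# Remaining stages (AST optimizer, code generation, optimizer, assembler)
# all happen inside compile(); only the final code object is returned.
code = compile(tree, "<example>", "exec")

# Inspect the final bytecode produced by the assembler.
instructions = list(dis.get_instructions(code))

# Run the code object to confirm it behaves like the source.
namespace = {}
exec(code, namespace)

print(tokens[0].string, type(tree.body[0]).__name__, len(instructions))
```

Printing ast.dump(tree, indent=2) or calling dis.dis(code) gives a more detailed view of the two ends of the pipeline.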
-
The refactoring was primarily motivated by:
- Need for better unit testing capabilities
- Improving code maintainability
- Making the compiler more modular and accessible
-
New pseudo-instructions were introduced as an intermediate representation between AST and bytecode, providing better abstraction
-
The changes make the compiler more flexible for:
- Writing targeted unit tests for specific optimizations
- Hooking in alternative compiler implementations
- Customizing individual compilation stages
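For contrast, here is what a test of one optimization looks like when only the public API is available. This sketch checks constant folding end to end by compiling source and scanning the final bytecode. It cannot tell which stage did the folding, which is the limitation that stage-level tests address. The helper names opnames and has_binary_op are hypothetical, chosen for this example.

```python
import dis


def opnames(source):
    """Compile source and return the names of the resulting instructions."""
    code = compile(source, "<test>", "exec")
    return [instr.opname for instr in dis.get_instructions(code)]


def has_binary_op(source):
    """Return True if any binary-operation instruction survives compilation."""
    return any(name.startswith("BINARY_") for name in opnames(source))


# Both operands are constants, so the multiplication is folded away.
folded = has_binary_op("x = 2 * 3")

# One operand is a name, so the multiplication must happen at run time.
unfolded = has_binary_op("x = a * 3")

print(folded, unfolded)
```

A stage-level test would instead hand a small instruction sequence directly to the optimizer and compare the result, without going through the tokenizer, parser, and code generator first.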
-
Key improvements for bytecode handling:
- Better handling of jump instructions
- Cleaner separation between logical jumps and physical offsets
- Improved optimization passes
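The separation between logical jumps and physical offsets can be illustrated with a toy model. This is not CPython's implementation: the instruction tuples, the labels mapping, and the resolve_jumps function are all assumptions made for this sketch, and every instruction is treated as one unit wide. Jumps name a label while the code is being generated and optimized, and only a final pass turns labels into numeric offsets.

```python
# Opcodes that take a label as their argument in this toy model.
JUMPS = {"JUMP", "POP_JUMP_IF_FALSE"}


def resolve_jumps(instructions, labels):
    """Replace label arguments with offsets relative to the next instruction."""
    resolved = []
    for index, (opname, arg) in enumerate(instructions):
        if opname in JUMPS:
            target = labels[arg]
            arg = target - (index + 1)
        resolved.append((opname, arg))
    return resolved


# Logical form: jumps refer to labels, not positions.
program = [
    ("LOAD_NAME", "flag"),
    ("POP_JUMP_IF_FALSE", "else_branch"),
    ("LOAD_CONST", 1),
    ("JUMP", "end"),
    ("LOAD_CONST", 2),        # else_branch
    ("RETURN_VALUE", None),   # end
]
labels = {"else_branch": 4, "end": 5}

# Physical form: labels are gone, jumps carry numeric offsets.
resolved = resolve_jumps(program, labels)
print(resolved)
```

Because offsets are computed last, an optimization pass can insert or remove instructions in the logical form without having to patch every jump that crosses the change; only the label positions need updating.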
-
Testing capabilities are exposed through the _testinternalcapi module, though they are not yet officially part of the standard library
-
The changes simplified compile.c, which was previously one of the largest handwritten files in CPython
-
Future possibilities include exposing compilation stages through the standard library if compelling use cases emerge
-
The refactoring helps make CPython more accessible to contributors by breaking down complex compilation steps