Stephen Macke - Python as a Hackable Language for Interactive Data Science | PyData Global 2023

Learn how Python's hackability enables reactive execution in Jupyter notebooks, makes AST transformations practical, and improves interactive data science workflows with IPyFlow.

Key takeaways
  • Python’s dominance in data science stems from its extensive ecosystem of data-related libraries and excellent tooling for interactive programming

  • The IPyFlow kernel enables reactive execution in Jupyter notebooks, automatically refreshing dependent cells when variables change without unnecessary recalculations

  • AST (Abstract Syntax Tree) transformations allow Python to be extended with new features like top-level await and optional chaining, making it more convenient for specific use cases

  • Memoization can be used to skip expensive computations when inputs haven’t changed, making notebook interactions appear instantaneous

  • The pyccolo library enables composable AST transformations that don’t conflict with each other, unlike traditional AST transformers

  • Interactive widgets (like sliders) can be integrated with reactive execution for dynamic data visualization and educational purposes

  • Out-of-order execution and state management issues in notebooks can be addressed through proper instrumentation and dependency tracking

  • Static code analysis alone isn’t sufficient for determining dependencies in notebooks; runtime introspection is often necessary

  • The combination of reactive execution and memoization can significantly improve the interactive data analysis workflow by reducing wait times

  • Python’s extensibility enables creating tools that enhance the notebook experience while maintaining a clean, intuitive syntax
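
To make the reactive execution and memoization takeaways concrete, here is a minimal sketch using only the standard library. It is not how IPyFlow works internally and does not use its API: ToyNotebook, reads_and_writes, and the cell ids are hypothetical names invented for illustration, and the sketch assumes that dependencies can be read off statically from variable names and that cell inputs can be compared by their repr.

```python
import ast


def reads_and_writes(source):
    """Statically collect the names a cell loads and the names it binds."""
    reads, writes = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                reads.add(node.id)
            else:  # Store or Del context
                writes.add(node.id)
    return reads, writes


class ToyNotebook:
    """Hypothetical stand-in for a reactive kernel; not the IPyFlow API."""

    def __init__(self):
        self.cells = {}      # cell id -> source, kept in notebook order
        self.namespace = {}  # shared user namespace
        self.memo = {}       # (source, input snapshot) -> written values
        self.executed = []   # ids of cells that actually ran (memo misses)

    def _execute(self, cell_id):
        source = self.cells[cell_id]
        reads, writes = reads_and_writes(source)
        # Assumption: repr is a good enough fingerprint for simple inputs.
        inputs = tuple(sorted(
            (name, repr(self.namespace[name]))
            for name in reads if name in self.namespace
        ))
        key = (source, inputs)
        if key in self.memo:
            # Same code, same inputs: restore cached outputs, skip the work.
            self.namespace.update(self.memo[key])
        else:
            exec(source, self.namespace)
            self.executed.append(cell_id)
            self.memo[key] = {
                name: self.namespace[name]
                for name in writes if name in self.namespace
            }
        return writes

    def run(self, cell_id, source=None):
        """Run one cell, then refresh later cells that read what changed."""
        if source is not None:
            self.cells[cell_id] = source
        changed = set(self._execute(cell_id))
        order = list(self.cells)
        for other in order[order.index(cell_id) + 1:]:
            other_reads, _ = reads_and_writes(self.cells[other])
            if other_reads & changed:
                changed |= self._execute(other)


nb = ToyNotebook()
nb.run("a", "x = 2")
nb.run("b", "y = x * 10")
nb.run("c", "z = 100")
nb.run("a", "x = 3")  # cell b is refreshed; cell c does not depend on x
nb.run("a", "x = 2")  # both a and b are served from the memo table
```

The sketch also shows why the static approach falls short: a cell such as `items.append(1)` only loads the name `items`, so this toy never marks `items` as changed and dependent cells go stale. Catching mutations and aliases like that is where runtime introspection comes in.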