Ritchie Vink - Polars 1.0 and beyond | PyData Amsterdam 2024
Explore Polars 1.0's evolution with Ritchie Vink: GPU acceleration, optimized joins, async runtime, and streaming capabilities for high-performance data processing at scale.
- Polars 1.0, released in July 2024, stabilized the API, with a focus on fewer breaking changes while keeping data frame operations fast
- GPU acceleration, built in collaboration with the NVIDIA RAPIDS team, offers up to 13x speedups on certain operations while keeping semantics consistent between CPU and GPU execution
- A built-in optimizer analyzes the query tree before materialization to eliminate unnecessary operations and improve execution efficiency
- A new non-equi join algorithm, claimed to be the fastest available, extends join support beyond traditional equi joins
- A custom async runtime is being developed for efficient parallel processing, designed specifically for morsel-driven parallelism and compute-bound workloads
- Minimal dependencies: Polars ships as a single binary with almost no required Python dependencies, which reduces security risk and keeps the binary small
- I/O performance is improved through a better Parquet reader and smart caching for CSV/NDJSON files
- A new plugin system lets third-party developers extend functionality while maintaining native performance
- Plotting is integrated through an Altair backend without compromising the core engine's focus
- A new streaming engine is in development to handle data that doesn't fit in memory, with a focus on keeping API semantics consistent
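To give a rough intuition for morsel-driven parallelism, here is a toy sketch in plain Python. It is not how Polars implements its runtime (that is written in Rust and not shown in this summary), and the function names `make_morsels`, `worker`, and `parallel_sum` are hypothetical. It only illustrates the scheduling pattern: the input is cut into small chunks (morsels), and workers pull the next morsel whenever they are free, so the load balances itself.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Empty, Queue


def make_morsels(values, morsel_size):
    """Split the input into small fixed-size chunks (morsels)."""
    return [values[i:i + morsel_size] for i in range(0, len(values), morsel_size)]


def worker(morsel_queue):
    """Pull morsels until the queue is empty and return a partial sum."""
    partial = 0
    while True:
        try:
            morsel = morsel_queue.get_nowait()
        except Empty:
            return partial
        partial += sum(morsel)


def parallel_sum(values, morsel_size=1_000, n_workers=4):
    """Sum values by letting workers pull morsels from a shared queue."""
    morsel_queue = Queue()
    for morsel in make_morsels(values, morsel_size):
        morsel_queue.put(morsel)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Every worker shares the same queue and takes work as it becomes free.
        partials = list(pool.map(worker, [morsel_queue] * n_workers))
    # Combine the per-worker partial results into the final answer.
    return sum(partials)


total = parallel_sum(list(range(10_000)))
print(total)
```

Python threads do not run compute-bound code in parallel because of the global interpreter lock, so this shows only the work-distribution idea, not a speedup.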