Ritchie Vink - Keynote on Polars Plugins

Discover how Polars, a parallel query engine, uses Rust plugins to add fast custom logic to queries, alongside query optimization, out-of-core processing, and parallel data aggregation.

Key takeaways
  • Not everything benefits from parallelism and speed.
  • Polars is a parallel query engine that runs Rust plugins.
  • Polars takes a query and runs it with optimizations and parallelism to make it fast.
  • The goal of Polars is to make query optimization easy and automatic.
  • Polars uses expressions to perform vectorized operations.
  • Expressions are composable and can be used in all operations.
  • Plugins let users add custom expressions, written in Rust, that run at native speed inside the query engine.
  • Plugins can be compiled and registered with the Polars engine.
  • Plugins provide a way to use multiple cores and get speed-ups.
  • Polars uses a single thread pool.
  • Polars can process a large dataset using out-of-core processing.
  • Polars is designed for out-of-core processing and parallelism.
  • Polars can read data from disk and then process it.
  • Polars can spill data to disk and then read it back in.
  • Polars uses caching to minimize disk I/O.
  • Polars can do data aggregation in parallel.
  • Aggregations can be cumulative.
  • Materialization is the step from query to output result.
  • Polars has a LazyFrame data structure for efficient query planning.
  • Polars can recognize and optimize arithmetic and string operations.
  • Polars is written in Rust, whose borrow checker helps ensure memory safety.
  • Custom Python functions called from Polars are limited by Python's Global Interpreter Lock (GIL); Rust plugins avoid it.