Tutorials - Matt Harrison: Getting Started with Polars

Learn the basics of Polars, a DataFrame library written in Rust that uses Apache Arrow under the hood, and discover how to work with data frames, handle large datasets, and improve performance through debugging techniques and data processing strategies.

Key takeaways
  • You can work with data frames in Polars, a library that leverages Rust and Arrow under the hood.
  • When debugging, consider the lazy approach in Polars, as it lets the query optimizer improve performance.
  • When working with dates, use Python's `datetime` module together with Polars.
  • When doing aggregations, split the data into groups and chain the operations so that Polars can process them in parallel.
  • When working with large datasets, use Polars to handle the load and reduce memory requirements.
  • You can use `pipe` in Polars to pass a list of expressions into a `group_by` aggregation.
  • When working with columns, use `describe` to inspect the data and decide on the best approach.
  • When doing aggregations, you can use a global variable to keep track of the current state of the data frame.
  • You can use `group_by` in Polars to make full use of the available CPU cores and improve performance.
  • When working with data frames, you can use `with_columns` to add or replace columns, including after reordering them.
  • When using Polars, you can pass in a regular expression to exclude certain columns from the output.
  • When working with data, you can use the `mean` and `std` (standard deviation) functions to summarize it.
  • When making plots, you can plot Polars data with Matplotlib.
  • When working with data frames, you can use `pipe` to pass a list of expressions into a `group_by` aggregation.
  • When doing aggregations, you can use the lazy approach to improve performance and reduce memory usage.
  • When working with Polars, you can chain operations so that the engine can process data in parallel and improve performance.