RubyConf 2023 - Get your Data prod ready, Fast, with Ruby Polars! by Paul Reece

Learn how to get your data production-ready fast with Ruby Polars, a powerful library for working with data frames. The talk covers handling missing values, filtering and joining data frames, and more.

Key takeaways
  • Polars is a powerful Ruby library for working with data frames, allowing for efficient data cleaning and transformation.
  • Ruby is a great language for data science and analysis, offering many benefits such as ease of use, flexibility, and performance.
  • When working with data frames in Ruby, it’s essential to deal with missing values, as they can cause issues in the analysis and modeling process.
  • Polars provides methods for handling missing values, including drop_nulls, which removes rows that contain nulls, and fill_null, which replaces nulls with a chosen value.
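  • The get_column and select methods extract specific columns from a data frame: get_column returns a single column as a Series, while select returns a new data frame containing only the chosen columns.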
  • Data frames can be converted to other data structures, such as Ruby hashes or arrays, using methods like to_h and to_a.
  • Series are a key concept in Polars: a Series is a one-dimensional data structure that holds a single column of data.
  • Data frames can be filtered using methods like filter, which keeps only the rows matching a condition, and drop_nulls, which removes rows with missing values.
  • Data frames can be combined using join, which matches rows on a key and supports strategies such as inner, outer, and left joins, and concat, which stacks data frames together.
  • Polars offers many visualization tools, including plots, bar charts, and histograms, which can be used to analyze and visualize data.