Marco Gorelli - Polars and time zones: everything you need to know | PyData Global 2023

Learn how to handle time zones effectively in Polars: from DST transitions to timezone conversions. Practical tips for working with datetime data in data analysis workflows.

Key takeaways
  • When dealing with time zones, rely on a library such as Polars rather than hand-rolled logic: DST transitions, changes to time zone rules over time, and differences between countries are easy to get wrong

  • Datetime values have two key components: a time unit (the smallest representable time increment; ms, us, or ns in Polars) and an optional time zone

  • The difference between a calendar duration (1d) and a fixed duration (24h) matters: a calendar day can last 23, 24, or 25 hours because of DST changes

  • Store datetimes in UTC for consistency, but perform logic in local time zones when working with DST-affected data

  • dt.convert_time_zone vs dt.replace_time_zone:

    • Convert answers “what time is it now in another location?”: the instant stays the same and only the displayed local time changes
    • Replace changes the zone while keeping the same local (wall-clock) time, so the underlying instant changes

  • Polars handles time zones by deferring to the Rust chrono-tz crate, offering better post-2038 support compared to pandas’ pytz

  • When Polars reads datetime strings from a CSV and those strings contain UTC offsets, it converts all of them to UTC

  • Time zone boundaries and day lengths vary by location because DST transitions happen on different dates around the world

  • Using PyArrow strings instead of regular strings improves performance when working with datetime data

  • For datetime interpolation and manipulation, Polars offers expressions and the .dt namespace with timezone-aware operations