Heidrich, Kiraly, & Ray - sktime - python toolbox for time series | PyData Global 2023

Learn about sktime, an open-source Python library for time series analysis. Covers forecasting, classification, pipelines, and integration with popular ML packages.

Key takeaways
  • sktime is an open-source library for machine learning with time series; it integrates multiple time series packages behind unified, scikit-learn-style interfaces

  • The library supports multiple time series tasks including:

    • Forecasting
    • Classification
    • Regression
    • Clustering
    • Annotation
  • Key features introduced in 2022-2023:

    • Graphical pipelines for complex non-sequential workflows
    • Improved parallelization for multivariate and hierarchical data
    • Probabilistic forecasting with distribution objects
    • Benchmarking capabilities
    • Marketplace and deployment features
  • Forecasting interfaces follow scikit-learn patterns with fit/predict methods and support:

    • Exogenous variables
    • Multiple seasonality patterns
    • Automatic parameter tuning
    • Probabilistic predictions with prediction intervals
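
As a rough illustration of the fit/predict pattern described above, the sketch below implements a toy seasonal-naive forecaster in plain Python. It is not sktime code: the class name `ToySeasonalNaive`, the `sp` and `coverage` parameters, and the normal-approximation interval are assumptions made for this example only.

```python
from statistics import NormalDist, stdev


class ToySeasonalNaive:
    """Hypothetical toy forecaster: repeats the value seen one season earlier."""

    def __init__(self, sp=1):
        self.sp = sp  # seasonal periodicity, e.g. 4 for quarterly data

    def fit(self, y):
        y = list(y)
        if len(y) < self.sp + 2:
            raise ValueError("need at least sp + 2 observations")
        self._y = y
        # In-sample one-season-ahead errors, used to size the intervals.
        self._resid = [y[t] - y[t - self.sp] for t in range(self.sp, len(y))]
        return self

    def predict(self, fh):
        # fh is a list of steps ahead (1 = next period).
        n = len(self._y)
        return [self._y[n - self.sp + (h - 1) % self.sp] for h in fh]

    def predict_interval(self, fh, coverage=0.9):
        # Assumption: residuals are roughly normal with constant spread.
        z = NormalDist().inv_cdf(0.5 + coverage / 2)
        half_width = z * stdev(self._resid)
        return [(p - half_width, p + half_width) for p in self.predict(fh)]


y = [10, 20, 30, 40, 12, 23, 31, 44]      # two "years" of quarterly data
forecaster = ToySeasonalNaive(sp=4).fit(y)
point = forecaster.predict(fh=[1, 2, 5])
interval = forecaster.predict_interval(fh=[1], coverage=0.9)
print(point)
print(interval)
```

The point of the sketch is the calling convention: hyperparameters go to the constructor, data goes to `fit`, and the forecasting horizon goes to `predict`.
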
  • The library provides adaptors to many popular time series packages and models, including:

    • ARIMA
    • Prophet
    • TSFresh
    • pmdarima
    • TBATS
  • Pipeline capabilities include:

    • Sequential pipelines for simple workflows
    • Graphical pipelines for complex cases with parallel steps
    • Transformation pipelines for preprocessing
    • Composable interfaces across different tasks
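
The sequential pipeline idea above can be sketched in plain Python: transformers are applied in order before fitting, and their inverses are applied in reverse order to the forecasts. All class names here (`LogTransformer`, `Differencer`, `MeanForecaster`, `SequentialPipeline`) are hypothetical stand-ins for this example and not sktime's implementation. The sketch also assumes the horizon is the consecutive steps 1, 2, ..., H.

```python
import math


class LogTransformer:
    def fit(self, y):
        return self

    def transform(self, y):
        return [math.log(v) for v in y]

    def inverse_transform(self, y):
        return [math.exp(v) for v in y]


class Differencer:
    """First differences; remembers the last level to re-integrate forecasts."""

    def fit(self, y):
        self._last = y[-1]
        return self

    def transform(self, y):
        return [b - a for a, b in zip(y, y[1:])]

    def inverse_transform(self, y):
        # Cumulate forecast differences, starting from the last observed level.
        out, level = [], self._last
        for d in y:
            level += d
            out.append(level)
        return out


class MeanForecaster:
    def fit(self, y):
        self._mean = sum(y) / len(y)
        return self

    def predict(self, fh):
        return [self._mean for _ in fh]


class SequentialPipeline:
    def __init__(self, transformers, forecaster):
        self.transformers = transformers
        self.forecaster = forecaster

    def fit(self, y):
        for t in self.transformers:
            y = t.fit(y).transform(y)
        self.forecaster.fit(y)
        return self

    def predict(self, fh):
        y_pred = self.forecaster.predict(fh)
        for t in reversed(self.transformers):
            y_pred = t.inverse_transform(y_pred)
        return y_pred


pipe = SequentialPipeline([LogTransformer(), Differencer()], MeanForecaster())
pipe.fit([1, 2, 4, 8, 16])
y_pred = pipe.predict(fh=[1, 2, 3])
print(y_pred)
```

Because the pipeline exposes the same `fit`/`predict` methods as a single forecaster, it can be used anywhere a forecaster is expected, which is what makes such components composable.
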
  • Strong focus on community and open governance:

    • Permissive license
    • Mentoring program for new contributors
    • Active Discord community
    • Regular developer sprints
  • Extensive support for different data formats:

    • Univariate and multivariate series
    • Hierarchical data
    • Panel data
    • Various time index types
  • Built-in tools for:

    • Model evaluation and validation
    • Parameter tuning
    • Cross-validation
    • Performance metrics
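
To make the evaluation workflow above concrete, here is a minimal plain-Python sketch of expanding-window cross-validation for a forecaster, scored with mean absolute error. The function names (`expanding_window_splits`, `evaluate`, `last_value_forecast`) and their parameters are assumptions for this illustration; sktime's own splitters and evaluation utilities are not reproduced here.

```python
def expanding_window_splits(n, initial_window, fh, step=1):
    """Yield (train_indices, test_indices) with a growing training window."""
    max_h = max(fh)
    end = initial_window
    while end + max_h <= n:
        train = list(range(end))
        test = [end + h - 1 for h in fh]  # h steps after the training window
        yield train, test
        end += step


def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def last_value_forecast(y_train, fh):
    # Baseline: carry the last observed value forward.
    return [y_train[-1] for _ in fh]


def evaluate(y, forecast_fn, initial_window, fh, step=1):
    """Return one MAE score per cross-validation fold."""
    scores = []
    for train, test in expanding_window_splits(len(y), initial_window, fh, step):
        y_train = [y[i] for i in train]
        y_test = [y[i] for i in test]
        scores.append(mean_absolute_error(y_test, forecast_fn(y_train, fh)))
    return scores


y = [1, 2, 3, 4, 5, 6]
scores = evaluate(y, last_value_forecast, initial_window=3, fh=[1, 2])
print(scores)
```

Unlike shuffled k-fold splits, every fold here trains only on observations that precede the test points, so no future information leaks into the fit.
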
  • Integration capabilities with:

    • MLflow
    • scikit-learn
    • pandas