Sean Law - STUMPY: Modern Time Series Analysis with Matrix Profiles | SciPy 2024
Learn how STUMPY, a Python library for time series analysis, uses matrix profiles to efficiently find patterns and anomalies in large datasets without prior training.
- Matrix profiles provide a powerful way to analyze time series data by finding similar patterns and anomalies without requiring prior knowledge or training data
- STUMPY is a Python library that implements matrix profiles with high performance, supporting multi-CPU/GPU processing, streaming data, and distributed computing via Dask
- Core capabilities include:
  - Finding exact nearest neighbors and motifs in time series data
  - Detecting anomalies and conserved behaviors
  - Supporting multidimensional time series analysis
  - Providing pan-matrix profiles for variable-length pattern matching
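To make the nearest-neighbor idea concrete, here is a minimal brute-force sketch in plain NumPy. It is not STUMPY's implementation: the function names, the planted test signal, and the exclusion-zone width of `ceil(m / 4)` are assumptions chosen for illustration. For every subsequence of length `m` it records the z-normalized Euclidean distance to its closest non-overlapping neighbor; the smallest value marks a motif and the largest marks an anomaly candidate (discord).

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D array; a constant input maps to all zeros."""
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def naive_matrix_profile(T, m):
    """Brute-force self-join: nearest-neighbor distance and index per subsequence."""
    T = np.asarray(T, dtype=float)
    n_sub = len(T) - m + 1
    subs = np.array([znorm(T[i:i + m]) for i in range(n_sub)])
    excl = int(np.ceil(m / 4))  # assumed exclusion zone to skip trivial self-matches
    P = np.full(n_sub, np.inf)  # nearest-neighbor distances
    I = np.full(n_sub, -1)      # nearest-neighbor start indices
    for i in range(n_sub):
        d = np.linalg.norm(subs - subs[i], axis=1)
        d[max(0, i - excl):min(n_sub, i + excl + 1)] = np.inf
        j = int(np.argmin(d))
        P[i], I[i] = d[j], j
    return P, I

# Hypothetical data: noise with the same shape planted at two locations
rng = np.random.default_rng(0)
T = rng.normal(size=300)
pattern = np.sin(np.linspace(0, 4 * np.pi, 30))
T[40:70] = pattern
T[200:230] = pattern

P, I = naive_matrix_profile(T, m=30)
motif_idx = int(np.argmin(P))    # start of the best-conserved subsequence
discord_idx = int(np.argmax(P))  # start of the most isolated subsequence
print(motif_idx, int(I[motif_idx]), discord_idx)
```

This loop costs O(n² · m) time, so it is only suitable for small inputs. In STUMPY, `stumpy.stump(T, m)` computes the same self-join far more efficiently.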
- Key advantages:
  - User-friendly API requiring minimal parameters
  - Highly interpretable results based on Euclidean distance
  - Scalable to large datasets (50M+ points)
  - No need for data preprocessing like detrending
  - 100% test coverage and battle-tested in production
- Technical details:
  - Computes Euclidean distances between subsequences using a sliding window
  - Leverages FFT and computation reuse for efficiency
  - Supports z-normalization of subsequences
  - Minimal dependencies (NumPy, SciPy, Numba)
  - Recent 15-20% performance improvements
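The combination of sliding windows, FFT, and z-normalization can be sketched as follows. This is an illustrative NumPy version, not STUMPY's internal code; the function names are hypothetical, and it assumes no window of `T` is constant (so every standard deviation is non-zero). The FFT yields the dot product of the query with every window in one pass, and the identity d² = 2m(1 − ρ), where ρ is the Pearson correlation, converts those dot products into z-normalized Euclidean distances.

```python
import numpy as np

def sliding_dot_product(Q, T):
    """Dot product of Q with every length-m window of T, via FFT convolution."""
    n, m = len(T), len(Q)
    size = n + m - 1
    # Convolving T with the reversed query yields the sliding dot products
    conv = np.fft.irfft(np.fft.rfft(T, size) * np.fft.rfft(Q[::-1], size), size)
    return conv[m - 1:n]

def znorm_distance_profile(Q, T):
    """Z-normalized Euclidean distance from Q to every length-m window of T."""
    Q = np.asarray(Q, dtype=float)
    T = np.asarray(T, dtype=float)
    m = len(Q)
    QT = sliding_dot_product(Q, T)
    windows = np.lib.stride_tricks.sliding_window_view(T, m)
    mu_T, sigma_T = windows.mean(axis=1), windows.std(axis=1)
    mu_Q, sigma_Q = Q.mean(), Q.std()
    # Pearson correlation of Q with each window, clipped against rounding error
    corr = (QT - m * mu_Q * mu_T) / (m * sigma_Q * sigma_T)
    corr = np.clip(corr, -1.0, 1.0)
    return np.sqrt(2 * m * (1 - corr))

# Hypothetical data: the query is taken from the series itself
rng = np.random.default_rng(1)
T = rng.normal(size=200)
Q = T[50:70]
D = znorm_distance_profile(Q, T)
print(int(np.argmin(D)))  # index of the best match
```

The dot products cost O(n log n) instead of the O(n · m) of a direct loop, which is where the FFT pays off for long series.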
- Common use cases:
  - Pattern discovery and motif detection
  - Anomaly detection
  - Time series joins and comparisons
  - Clustering with matrix profile distance
  - Exploratory data analysis
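A time series join compares two different series: for each subsequence of `A`, it finds the closest subsequence anywhere in `B`. The sketch below is a brute-force NumPy illustration under stated assumptions (hypothetical function names, synthetic data, no constant windows); it is not how STUMPY computes joins. The planted shape in `B` is scaled and shifted to show that z-normalization matches shape rather than amplitude or offset.

```python
import numpy as np

def znorm_subsequences(T, m):
    """All length-m windows of T, each z-normalized (assumes no constant window)."""
    W = np.lib.stride_tricks.sliding_window_view(np.asarray(T, dtype=float), m)
    mu = W.mean(axis=1, keepdims=True)
    sigma = W.std(axis=1, keepdims=True)
    return (W - mu) / sigma

def naive_ab_join(A, B, m):
    """For each subsequence of A, distance to and index of its nearest neighbor in B."""
    ZA, ZB = znorm_subsequences(A, m), znorm_subsequences(B, m)
    # For z-normalized vectors a and b: ||a - b||^2 = 2m - 2 * (a . b)
    D2 = 2 * m - 2 * (ZA @ ZB.T)
    D = np.sqrt(np.maximum(D2, 0.0))  # clamp tiny negatives from rounding
    return D.min(axis=1), D.argmin(axis=1)

# Hypothetical data: the same shape appears in both series at different scales
rng = np.random.default_rng(2)
pattern = np.sin(np.linspace(0, 4 * np.pi, 30))
A = rng.normal(size=150)
B = rng.normal(size=250)
A[20:50] = pattern
B[100:130] = 3 * pattern + 5  # scaled and shifted copy

P_ab, I_ab = naive_ab_join(A, B, m=30)
print(int(np.argmin(P_ab)), int(I_ab[np.argmin(P_ab)]))
```

Unlike the self-join case, no exclusion zone is needed here because the subsequences come from two separate series.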
- Active open source project with:
  - 9 million+ downloads
  - 3,000+ GitHub stars
  - Regular releases and updates
  - Support for latest Python/NumPy versions
  - Extensive documentation and tutorials