Time series anomaly detection with a human-in-the-loop [PyCon DE & PyData Berlin 2024]
Learn how to combine machine learning with domain expertise for time series anomaly detection using Label Studio, Azure ML, and Python-based automation for efficient expert validation.
- Domain expert knowledge is crucial for time series anomaly detection: algorithms alone aren’t enough without human validation
- Label Studio serves as the core tool for expert feedback, offering:
  - Easy-to-use web interface for anomaly review
  - Webhook capabilities for automation
  - Support for multiple data formats
  - Programmatic interaction options
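To illustrate the programmatic side, here is a minimal sketch of how detected candidates might be packaged as a pre-annotated Label Studio task, so that experts only confirm or correct regions instead of drawing them from scratch. It is not the speakers' code. The layout follows the Label Studio task import format as I understand it (`data` plus `predictions`). The tag names `label` and `ts`, the `csv` data key, the `Anomaly` label, and the `model_version` string are assumptions that would have to match the project's actual labeling configuration.

```python
import json


def build_task(csv_url, candidates, model_version="candidate-detector-v0"):
    """Build one Label Studio task dict with pre-annotated anomaly regions.

    candidates: list of (start, end, score) tuples, where start/end are
    positions on the time axis and score is the detector's confidence.
    All tag and key names below are assumptions for illustration.
    """
    results = [
        {
            "from_name": "label",  # assumed name of the <TimeSeriesLabels> tag
            "to_name": "ts",       # assumed name of the <TimeSeries> tag
            "type": "timeserieslabels",
            "value": {
                "start": start,
                "end": end,
                "instant": False,
                "timeserieslabels": ["Anomaly"],  # assumed label name
            },
        }
        for start, end, _score in candidates
    ]
    # Use the strongest candidate as the task-level prediction score.
    top_score = max((score for _s, _e, score in candidates), default=0.0)
    return {
        "data": {"csv": csv_url},  # assumed data key from the labeling config
        "predictions": [
            {"model_version": model_version, "score": top_score, "result": results}
        ],
    }


task = build_task("https://example.com/sensor-42.csv", [(120, 135, 0.91)])
payload = json.dumps([task])  # a JSON list of tasks, ready for import
```

A list of such tasks could then be imported through the Label Studio UI, API, or SDK. The import step is left out here because it depends on the deployment.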
- System architecture combines:
  - Data ingestion pipeline
  - Pre-processing pipeline
  - Anomaly detection pipeline
  - Azure DevOps for orchestration
  - Azure Machine Learning Studio for ML workloads
- Automated workflow:
  - Starts with unsupervised ML to identify potential anomalies
  - Presents candidates to domain experts for validation
  - Incorporates feedback to improve future detection
  - Runs in batches rather than in real time
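The first step of this workflow can be sketched with a simple unsupervised detector run over one batch. The summary does not say which algorithm the talk uses, so the robust z-score below (median and median absolute deviation) is a stand-in chosen for brevity. The function names, the threshold of 3.5, and the `max_gap` merging rule are all assumptions.

```python
from statistics import median


def robust_z_scores(values):
    """Score each point by its distance from the median, scaled by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread in the batch: nothing can be scored as unusual.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD to be comparable to a standard deviation.
    return [0.6745 * (v - med) / mad for v in values]


def candidate_intervals(values, threshold=3.5, max_gap=1):
    """Return (start_index, end_index) intervals of flagged points.

    Flagged points at most `max_gap` indices apart are merged, so the
    expert reviews one region instead of many single points.
    """
    scores = robust_z_scores(values)
    flagged = [i for i, z in enumerate(scores) if abs(z) > threshold]
    intervals = []
    for i in flagged:
        if intervals and i - intervals[-1][1] <= max_gap:
            intervals[-1][1] = i  # extend the current interval
        else:
            intervals.append([i, i])  # start a new interval
    return [(start, end) for start, end in intervals]


batch = [10, 11, 10, 9, 10, 50, 52, 10, 11, 9, 10, 10]
print(candidate_intervals(batch))  # prints [(5, 6)]
```

Merging neighbouring points into intervals keeps the number of review items small, which matters when each item costs expert time.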
- Key implementation goals:
  - Minimize expert time investment
  - Provide reusable and scalable tooling
  - Enable quick iteration on models
  - Support flexible choice of methods
  - Create labeled datasets for future use cases
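One way the labeled data could support quick iteration is to recalibrate the detection threshold from expert verdicts after each batch. The sketch below is an illustration under assumptions, not the method from the talk. It assumes each reviewed candidate is a `(score, is_anomaly)` pair, picks the threshold that maximizes F1 on those pairs, and breaks ties toward the higher threshold so that fewer candidates reach the expert.

```python
def f1_at(threshold, reviewed):
    """F1 score if only candidates with score >= threshold were flagged."""
    tp = sum(1 for s, ok in reviewed if s >= threshold and ok)
    fp = sum(1 for s, ok in reviewed if s >= threshold and not ok)
    fn = sum(1 for s, ok in reviewed if s < threshold and ok)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def tune_threshold(reviewed, default=3.5):
    """Pick the score threshold that best matches the expert verdicts.

    reviewed: list of (score, is_anomaly) pairs from validated candidates.
    Returns `default` when no feedback is available yet.
    """
    if not reviewed:
        return default
    options = sorted({score for score, _ in reviewed})
    # Highest F1 wins; on ties, prefer the higher threshold.
    return max(options, key=lambda t: (f1_at(t, reviewed), t))


feedback = [(2.0, False), (3.0, False), (3.5, True), (4.0, True), (6.0, True)]
print(tune_threshold(feedback))  # prints 3.5
```

Because experts only see candidates that passed the previous threshold, this kind of calibration can only tighten the detector. Finding missed anomalies would need a separate sample of unflagged data.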
- System design priorities:
  - Python-based implementation (~90% Python code)
  - Modular architecture for reusability
  - Easy-to-use interfaces for domain experts
  - Automated infrastructure setup via Terraform
  - Integration with existing Azure services
- The focus is on practical industrial applications rather than theoretical approaches, with an emphasis on getting value from data quickly instead of spending months on initial data labeling