Time series anomaly detection with a human-in-the-loop [PyCon DE & PyData Berlin 2024]
Learn how to combine machine learning with domain expertise for time series anomaly detection using Label Studio, Azure ML, and Python-based automation for efficient expert validation.
- Domain expert knowledge is crucial for time series anomaly detection: algorithms alone aren't enough without human validation
- Label Studio serves as the core tool for expert feedback, offering:
    - Easy-to-use web interface for anomaly review
    - Webhook capabilities for automation
    - Support for multiple data formats
    - Programmatic interaction options
- System architecture combines:
    - Data ingestion pipeline
    - Pre-processing pipeline
    - Anomaly detection pipeline
    - Azure DevOps for orchestration
    - Azure Machine Learning Studio for ML workloads
- Automated workflow:
    - Starts with unsupervised ML to identify potential anomalies
    - Presents the candidates to domain experts for validation
    - Incorporates expert feedback to improve future detection
    - Runs in batches rather than in real time
- Key implementation goals:
    - Minimize expert time investment
    - Provide reusable and scalable tooling
    - Enable quick iteration on models
    - Support flexible choice of methods
    - Create labeled datasets for future use cases
- System design priorities:
    - Python-based implementation (~90% Python code)
    - Modular architecture for reusability
    - Easy-to-use interfaces for domain experts
    - Automated infrastructure setup via Terraform
    - Integration with existing Azure services
- Focus on practical industrial applications rather than theoretical approaches, with an emphasis on getting value from data quickly instead of spending months on initial data labeling
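
The automated workflow summarized above (unsupervised detection, expert validation, feedback, batch execution) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the speakers' implementation: the robust z-score detector (median and MAD) is only a stand-in for whichever unsupervised method is actually used, the names `Candidate`, `detect_candidates`, `to_review_tasks`, `apply_feedback`, and `tune_threshold` are hypothetical, and the task dictionaries only loosely mirror the `data` payload that Label Studio tasks carry. The expert verdicts are hard-coded here; in a real setup they would come back from the labeling tool.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Candidate:
    index: int                  # position of the point within the batch
    value: float                # observed value
    score: float                # robust z-score assigned by the detector
    label: str = "unreviewed"   # set later from the expert's verdict


def detect_candidates(values, threshold=3.5):
    """Flag points whose robust z-score (median/MAD based) exceeds the threshold.

    Stand-in detector: any unsupervised method could be swapped in here.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the batch, nothing to flag
    candidates = []
    for i, v in enumerate(values):
        score = 0.6745 * (v - med) / mad
        if abs(score) > threshold:
            candidates.append(Candidate(i, v, score))
    return candidates


def to_review_tasks(candidates, series_id):
    """Wrap candidates as review tasks; the keys inside `data` are hypothetical."""
    return [
        {"data": {"series_id": series_id, "index": c.index,
                  "value": c.value, "score": round(c.score, 2)}}
        for c in candidates
    ]


def apply_feedback(candidates, verdicts):
    """Attach expert verdicts (index -> True if confirmed) and return reviewed items."""
    for c in candidates:
        if c.index in verdicts:
            c.label = "anomaly" if verdicts[c.index] else "normal"
    return [c for c in candidates if c.label != "unreviewed"]


def tune_threshold(threshold, reviewed, step=0.5):
    """Toy feedback rule: raise the threshold if rejections outnumber confirmations."""
    rejected = sum(1 for c in reviewed if c.label == "normal")
    confirmed = sum(1 for c in reviewed if c.label == "anomaly")
    return threshold + step if rejected > confirmed else threshold


# One batch run: detect, hand over for review, fold the verdicts back in.
batch = [10.0, 10.2, 9.9, 10.1, 25.0, 10.0, 9.8, 10.3, 14.0, 10.1]
candidates = detect_candidates(batch)
tasks = to_review_tasks(candidates, series_id="sensor-1")
verdicts = {4: True, 8: False}  # hard-coded stand-in for expert answers
reviewed = apply_feedback(candidates, verdicts)
new_threshold = tune_threshold(3.5, reviewed)

for c in reviewed:
    print(c.index, c.value, c.label)
```

The reviewed candidates double as a small labeled dataset, which is one way the loop can serve the goal of building labels for later use cases while only asking experts to look at flagged points.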