Emeli Dral - More like this: monitoring recommender systems in production | PyData Global 2023

Testing

Learn effective strategies for monitoring recommender systems in production, from business KPIs to data quality metrics, with practical tips for A/B testing & issue detection.

Key takeaways

Start with business KPIs (revenue, conversions) as the primary metrics for monitoring recommender systems in production
Implement multiple layers of monitoring:
- Service health metrics (memory usage, response time)
- Data quality and drift monitoring
- Online quality metrics (clicks, views)
- Offline proxy metrics (precision@k, recall@k)
- Beyond accuracy metrics (diversity, novelty, serendipity)
Use proxy metrics for faster issue detection since business metrics like revenue can have delayed feedback
Consider attribution challenges when measuring recommender system impact, especially for long purchase cycles
Implement A/B testing to properly measure economic impact by comparing user segments with/without recommendations
Monitor data quality and drift closely, since data pipeline issues can significantly degrade model performance
Create synthetic “avatar” users with specific properties for testing and debugging recommendation behavior
Calculate metrics in batch mode to analyze trends and compare against reference data
Establish correlation between offline metrics and online business metrics for better model selection
Use tools like multi-armed bandits to occasionally serve random recommendations to avoid feedback loops
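
As a rough illustration of the offline proxy metrics mentioned above (precision@k, recall@k) computed in batch mode and compared against reference data, here is a minimal Python sketch. It is not code from the talk: the function names, the toy user data, and the convention of dividing precision by k even when fewer than k items are returned are all assumptions made for the example. It also assumes each recommendation list contains no duplicate items.

```python
from typing import Callable, Dict, Sequence, Set


def precision_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the top-k recommended items that the user found relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    # Assumed convention: divide by k even if fewer than k items were served.
    return hits / k


def recall_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the user's relevant items that appear in the top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0  # no relevant items recorded for this user
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def mean_over_users(
    metric: Callable[[Sequence[str], Set[str], int], float],
    recs_by_user: Dict[str, Sequence[str]],
    relevant_by_user: Dict[str, Set[str]],
    k: int,
) -> float:
    """Average a per-user metric over users present in both mappings."""
    users = [u for u in recs_by_user if u in relevant_by_user]
    if not users:
        return 0.0
    total = sum(metric(recs_by_user[u], relevant_by_user[u], k) for u in users)
    return total / len(users)


# Hypothetical toy batches: a reference period and the current period.
reference_recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
reference_relevant = {"u1": {"a", "c"}, "u2": {"d"}}
current_recs = {"u1": ["x", "b", "c"], "u2": ["y", "z", "f"]}
current_relevant = {"u1": {"a", "c"}, "u2": {"d"}}

reference_p = mean_over_users(precision_at_k, reference_recs, reference_relevant, k=3)
current_p = mean_over_users(precision_at_k, current_recs, current_relevant, k=3)

# A drop relative to the reference batch is a signal worth investigating.
print(f"precision@3 reference={reference_p:.3f} current={current_p:.3f}")
```

In practice the "relevant" sets would come from logged interactions (clicks, purchases), and the alert threshold for a drop relative to the reference batch would need to be tuned per system.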
