Select ML from Databases [PyCon DE & PyData Berlin 2024]
Learn how to integrate machine learning directly into databases, from built-in ML modules to custom implementations. Explore benefits, challenges & real-world use cases.
- ML functionality is increasingly being integrated directly into databases, allowing analysis and model training on data where it lives
- Three main approaches for ML in databases:
  - Built-in ML modules (limited flexibility but easy to use)
  - Third-party integrations (medium flexibility)
  - Custom ML methods (full flexibility but more complex)
- Benefits of database-integrated ML:
  - No ETL jobs needed
  - Reduced infrastructure complexity
  - Better security (data doesn’t leave the database)
  - Simplified deployment without extra microservices
  - Individual services can be scaled independently
- Key considerations for implementation:
  - Model development can be done offline or locally
  - Models need to be packaged together with their dependencies
  - SQL queries can be used for model inference
  - Performance monitoring at the query level is important
  - User-defined functions (UDFs) enable custom ML integration
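The considerations above (offline training, packaging, a UDF, inference through SQL) can be sketched end to end with Python's standard `sqlite3` module. This is a minimal illustration, not the setup shown in the talk: the `customers` table, the feature names, and the logistic-regression coefficients are all hypothetical, and SQLite stands in for whichever database is actually used.

```python
import json
import math
import sqlite3

# Hypothetical "packaged" model: coefficients of a logistic regression that
# was trained offline. Feature names and weights are made up for illustration.
PACKAGED_MODEL = json.dumps({
    "intercept": -1.0,
    "weights": {"support_tickets": 0.8, "months_active": -0.1},
})


def make_churn_udf(packaged: str):
    """Build a scalar function that scores one row with the packaged model."""
    model = json.loads(packaged)
    weights = model["weights"]

    def predict_churn(support_tickets, months_active):
        z = (model["intercept"]
             + weights["support_tickets"] * support_tickets
             + weights["months_active"] * months_active)
        return 1.0 / (1.0 + math.exp(-z))  # logistic function

    return predict_churn


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, support_tickets INTEGER, months_active INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 5, 2), (2, 0, 36), (3, 2, 10)],
)

# Register the Python function as a two-argument SQL UDF.
conn.create_function("predict_churn", 2, make_churn_udf(PACKAGED_MODEL))

# Inference is now an ordinary SQL query over the data where it lives.
rows = conn.execute(
    "SELECT id, predict_churn(support_tickets, months_active) AS churn_probability "
    "FROM customers ORDER BY churn_probability DESC"
).fetchall()
for customer_id, probability in rows:
    print(customer_id, round(probability, 3))
```

Keeping the model as a small serialized artifact means it can be retrained offline and swapped in by re-registering the function, without changing the queries that call it.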
- Common use cases:
  - Churn prediction
  - Insurance quote estimation
  - Anomaly detection
  - Time series analysis
  - Personalized recommendations
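As a rough illustration of one of these use cases, anomaly detection can run entirely inside the database with a custom aggregate. The sketch below again uses SQLite through Python's `sqlite3` module; the `readings` table, its values, the `stddev_pop` aggregate name, and the z-score threshold of 2.0 are assumptions chosen for the example, not details from the talk.

```python
import math
import sqlite3


class StdDev:
    """Population standard deviation, exposed to SQL as a custom aggregate."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.total_sq = 0.0

    def step(self, value):
        if value is None:  # ignore NULLs, as built-in aggregates do
            return
        self.n += 1
        self.total += value
        self.total_sq += value * value

    def finalize(self):
        if self.n == 0:
            return None
        mean = self.total / self.n
        # max() guards against tiny negative values from rounding
        return math.sqrt(max(self.total_sq / self.n - mean * mean, 0.0))


conn = sqlite3.connect(":memory:")
conn.create_aggregate("stddev_pop", 1, StdDev)
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO readings (value) VALUES (?)",
    [(v,) for v in [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 42.0]],
)

# Flag rows whose z-score exceeds the (assumed) threshold of 2.0.
anomalies = conn.execute(
    """
    WITH stats AS (
        SELECT AVG(value) AS mean, stddev_pop(value) AS sd FROM readings
    )
    SELECT r.id, r.value
    FROM readings AS r, stats AS s
    WHERE s.sd > 0 AND ABS(r.value - s.mean) / s.sd > 2.0
    """
).fetchall()
print(anomalies)
```

A z-score rule is only one simple technique; it is used here because it fits in a single query, not because it is the method the talk recommends.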
- Limitations and challenges:
  - Built-in models may not fit all use cases
  - Flexibility has to be balanced against ease of use
  - Model explainability varies by approach
  - In-database ML can affect the performance of other database operations
  - Model lifecycle management still has to be handled
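One way to get a first impression of the performance impact is to time the same query with and without the model call. The helper below is a minimal sketch under stated assumptions: `timed_query`, the `events` table, and the trivial `score` function are hypothetical, and a production database would normally provide its own query statistics instead.

```python
import sqlite3
import time


def timed_query(conn, sql, params=()):
    """Run a query and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start


def score(x):
    # Stand-in for model inference; a real model would do more work per row.
    return 0.5 * x + 1.0


conn = sqlite3.connect(":memory:")
conn.create_function("score", 1, score)
conn.execute("CREATE TABLE events (x REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [(float(i),) for i in range(10_000)],
)

plain_rows, plain_s = timed_query(conn, "SELECT x FROM events")
scored_rows, scored_s = timed_query(conn, "SELECT score(x) FROM events")
print(f"plain: {plain_s:.4f}s, with UDF: {scored_s:.4f}s")
```

Timings from a single run are noisy, so repeated measurements on realistic data volumes give a more reliable picture than this one-off comparison.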