Hugo Bowne-Anderson - Full-stack Machine Learning and Generative AI for Data Scientists

Learn how to build end-to-end ML systems with Metaflow, from local prototyping to production deployment. Covers infrastructure, versioning, RAG, and security best practices.

Key takeaways
  • Full-stack machine learning requires robust infrastructure, including data management, compute, orchestration, versioning, deployment, and modeling layers

  • Production ML systems need common tooling and infrastructure for coordinated development and reliable execution without human supervision

  • Metaflow helps transition between prototyping and production by allowing code to be developed locally then deployed to cloud/Kubernetes with minimal changes

  • When building ML systems, versioning needs to cover not just code but also data, models, artifacts and experiment results to ensure reproducibility

  • Retrieval Augmented Generation (RAG) can improve LLM responses by providing relevant context from your own documentation/data sources

  • Moving to production is not binary but a graduated process - start with notebooks, add versioning, scale compute, automate deployment etc.

  • Infrastructure requirements increase as systems become more complex - from local development to cloud workstations, Kubernetes clusters, schedulers etc.

  • Visualization and reporting through custom cards/dashboards helps track experiments and communicate results to stakeholders

  • Parallel processing and branching allow efficient execution of complex ML workflows

  • Separating machine learning code, business logic, and infrastructure code is important for maintainability

  • Security considerations are critical - use environment variables and proper secrets management, not hardcoded credentials

  • Production deployment can take many forms beyond just REST APIs - batch inference, scheduled reports, event-triggered workflows etc.
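The versioning point above covers not just code but data, models, and experiment results. One way to sketch the idea is content addressing: hash the training data and hyperparameters together so that identical inputs always reproduce the same version key. This is only an illustration of the concept, not Metaflow's internal scheme (Metaflow versions run artifacts for you automatically):

```python
# Sketch: derive a reproducible version key for a model artifact by hashing
# its training data together with its hyperparameters. Illustrative only.
import hashlib
import json

def artifact_key(data: bytes, params: dict) -> str:
    h = hashlib.sha256()
    h.update(data)
    # Canonical JSON so that key order in params does not change the hash
    h.update(json.dumps(params, sort_keys=True).encode())
    return h.hexdigest()[:12]

key_a = artifact_key(b"training rows...", {"lr": 0.01, "epochs": 10})
key_b = artifact_key(b"training rows...", {"epochs": 10, "lr": 0.01})
key_c = artifact_key(b"training rows...", {"lr": 0.02, "epochs": 10})
```

Because the params are serialized canonically, `key_a` and `key_b` match while `key_c` differs, so any change to data or configuration produces a new, traceable artifact version.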
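The core loop of Retrieval Augmented Generation (RAG) is: retrieve the most relevant piece of your own documentation, then prepend it to the prompt as context. A minimal sketch follows, using simple token overlap for retrieval; a real system would use embeddings and a vector store, and all names here are illustrative rather than taken from the talk:

```python
# Minimal RAG sketch: pick the document sharing the most tokens with the
# query, then assemble an augmented prompt for the LLM. Illustrative only.

def tokenize(text: str) -> set:
    return set(text.lower().split())

def retrieve(query: str, docs: list) -> str:
    """Return the document with the largest token overlap with the query."""
    q = tokenize(query)
    return max(docs, key=lambda d: len(q & tokenize(d)))

def build_prompt(query: str, docs: list) -> str:
    context = retrieve(query, docs)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Metaflow versions every artifact of every run automatically.",
    "Kubernetes schedules containers across a cluster.",
]
prompt = build_prompt("How does Metaflow handle versioning?", docs)
```

The assembled prompt grounds the model's answer in your own data source instead of relying solely on what it memorized during training.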
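On the security point, the pattern of reading credentials from the environment rather than hardcoding them can be sketched as below. The variable name and helper are hypothetical; in production the environment would be populated by a proper secrets manager rather than set in code:

```python
# Sketch: load a credential from an environment variable and fail loudly
# if it is missing, instead of embedding the secret in source code.
import os

def get_secret(name: str) -> str:
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required secret: {name}")
    return value

# Normally set outside the code (shell, orchestrator, or secrets manager);
# set here only so the example is self-contained.
os.environ["EXAMPLE_API_KEY"] = "demo-only"
key = get_secret("EXAMPLE_API_KEY")
```

Failing fast on a missing variable surfaces configuration errors at startup, before an unauthenticated call fails somewhere deep in a workflow.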