Elijah ben Izzy & Stefan Krawczyk - Bridging Classic ML Pipelines with the World of LLMs

Learn how to bridge classic ML and LLM pipelines using DAGs and the Hamilton framework. Discover shared patterns, key differences, and best practices for building unified ML systems.

Key takeaways
  • DAGs (Directed Acyclic Graphs) are a crucial abstraction that can effectively model both traditional ML and LLM pipelines

  • Hamilton is a micro-orchestration framework that lets you define DAGs as declarative Python functions, where each function is a node and its parameter names declare its dependencies, enabling modular, testable, and self-documenting pipelines (see the sketch below)
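
A minimal runnable sketch of the idea, assuming Hamilton's Builder driver API; the column names (spend, signups) and the transformations are illustrative, not from the talk:

```python
import pandas as pd

from hamilton import driver
from hamilton.ad_hoc_utils import create_temporary_module


def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
    """Node `spend_per_signup`; depends on the `spend` and `signups` inputs."""
    return spend / signups


def spend_zero_mean(spend: pd.Series) -> pd.Series:
    """Node `spend_zero_mean`; Hamilton wires it to `spend` by parameter name."""
    return spend - spend.mean()


# Bundle the functions into a module and build the DAG from it.
module = create_temporary_module(spend_per_signup, spend_zero_mean)
dr = driver.Builder().with_modules(module).build()

result = dr.execute(
    ["spend_per_signup", "spend_zero_mean"],
    inputs={
        "spend": pd.Series([10.0, 20.0, 30.0]),
        "signups": pd.Series([1, 2, 5]),
    },
)
print(result)  # a dict mapping node names to the computed Series
```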

  • LLM and classic ML pipelines share structural patterns and engineering challenges:

    • Both require observability, evaluation, and productionization
    • Both can be modeled as DAGs of computational steps
    • Both need proper versioning and testing
  • Key differences between LLM and classic ML pipelines:

    • LLMs typically require GPUs for serving
    • LLM pipelines involve less feature engineering but more prompt engineering
    • LLM evaluation tends to be fuzzier because outputs are free-form text
  • Benefits of using Hamilton for ML/LLM pipelines:

    • Easy swapping between components and implementations
    • Built-in testing and debugging capabilities
    • Code reusability and modularity
    • Support for both batch and online implementations
    • Integration with existing tools (Airflow, Metaflow, etc.)
  • The framework emphasizes software engineering best practices:

    • Self-contained, modular components
    • Clear dependency management
    • Easy testing and debugging
    • Documentation through code structure
  • Pipelines can combine both ML and LLM components, using the @config.when decorator to switch flexibly between implementations (see the sketch below)
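
A hedged sketch of @config.when-based switching, assuming Hamilton's function_modifiers API; the node name, the config key (model_type), and the stubbed model calls are illustrative, not from the talk:

```python
from hamilton import driver
from hamilton.ad_hoc_utils import create_temporary_module
from hamilton.function_modifiers import config


@config.when(model_type="classic")
def sentiment__classic(document: str) -> str:
    """Classic ML path; a fitted model would go here (stubbed for brevity)."""
    return "positive" if "good" in document else "negative"


@config.when(model_type="llm")
def sentiment__llm(document: str, prompt_template: str) -> str:
    """LLM path; a real client call would go here (stubbed for brevity)."""
    prompt = prompt_template.format(document=document)
    return f"LLM verdict for prompt: {prompt!r}"


# The `__suffix` naming resolves both functions to a single `sentiment` node;
# the config passed to the driver decides which implementation runs.
module = create_temporary_module(sentiment__classic, sentiment__llm)
dr = (
    driver.Builder()
    .with_modules(module)
    .with_config({"model_type": "llm"})
    .build()
)
print(dr.execute(
    ["sentiment"],
    inputs={"document": "a good movie", "prompt_template": "Classify: {document}"},
))
```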

  • Prompts in LLM pipelines can be treated much like hyperparameters in traditional ML pipelines (see the sweep sketch below)
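
A minimal sketch of that analogy: sweeping prompt templates (and a sampling temperature) the way you would sweep a hyperparameter grid. run_pipeline and score are hypothetical stubs, not Hamilton or talk APIs:

```python
from itertools import product

prompt_templates = [
    "Summarize: {document}",
    "Summarize in one sentence: {document}",
]
temperatures = [0.0, 0.7]


def run_pipeline(template: str, temperature: float) -> str:
    """Stub for a pipeline run (e.g. a Hamilton driver call hitting an LLM)."""
    return f"output(template={template!r}, temperature={temperature})"


def score(output: str) -> float:
    """Stub evaluation; in practice an eval suite, grader model, or human review."""
    return float(len(output))


# Grid search over prompts, exactly as you would over learning rates.
results = {
    (template, temp): score(run_pipeline(template, temp))
    for template, temp in product(prompt_templates, temperatures)
}
best_config = max(results, key=results.get)
print("best (template, temperature):", best_config)
```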

  • Vector databases and embedding operations work similarly for both LLM and classic ML use cases (see the retrieval sketch below)
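
A hedged sketch of a shared embedding-and-retrieval step; the toy deterministic embedding and in-memory cosine search stand in for a real embedding model and vector database client:

```python
import numpy as np

corpus = ["shipping policy", "refund policy", "warranty terms"]


def embed(texts: list[str]) -> np.ndarray:
    """Toy deterministic embedding (character-code histogram); illustrative only."""
    return np.array(
        [[sum(ord(c) for c in t if ord(c) % 8 == i) for i in range(8)] for t in texts],
        dtype=float,
    )


index = embed(corpus)  # in a real system, these rows would live in a vector database


def top_k(query: str, k: int = 2) -> list[str]:
    """Cosine-similarity retrieval; identical whether the consumer is classic ML or an LLM."""
    q = embed([query])[0]
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q) + 1e-9)
    return [corpus[i] for i in np.argsort(-sims)[:k]]


# Classic ML: retrieved neighbors become features (e.g. nearest-neighbor labels).
# LLM: the same neighbors are injected into the prompt as context (RAG).
print(top_k("what is the refund policy?"))
```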

  • The field requires tools that can handle rapid changes and iterations, especially in the LLM space