Hugo Bowne-Anderson - Orchestrating Generative AI Workflows to Deliver Business Value

Learn how to orchestrate generative AI workflows and deliver business value: writing Pythonic code, handling failures gracefully, tracing data and models, and more, drawn from Hugo Bowne-Anderson's insights.

Key takeaways
  • Pythonic code can be used to write full-stack machine learning workflows, allowing data scientists to focus on modeling while software engineers can focus on infrastructure.
  • Generative AI workflows require orchestration, compute, and data freshness, making data versioning increasingly important.
  • Traditional software engineering approaches may not be applicable to generative AI, as it involves handling failures gracefully, tracing data and models, and versioning code.
  • Large language models (LLMs) tend to hallucinate; it is often more practical to address this at inference time than during training.
  • Augmenting LLMs with relevant data and using retrieval-augmented generation can improve output relevance.
  • Inference tuning, fine-tuning, and retrieval-augmented generation can be used to update large language models with current and relevant data.
  • Metaflow is an open-source framework for building and managing full-stack machine learning workflows.
  • Data freshness is a challenge in generative AI workflows, and scheduling and event-based workflows can help address this issue.
  • Compared with closed APIs, open APIs and open-source foundation models offer more control over the supply chain.
  • Quantization and LoRA (Low-Rank Adaptation) can reduce the cost of adapting and serving models.
  • Data engineers and software engineers require different skill sets to work with generative AI.
  • The role of data scientists has evolved, and they must now consider the orchestration story in addition to modeling.
  • The directed acyclic graph (DAG) can be used to visualize the flow of data and models in generative AI workflows.
  • Versioning is crucial in generative AI workflows, including data, models, and code.
  • The stack remains the same in generative AI, but the importance of components like orchestration and data freshness changes.
  • Generative AI workflows involve handling massive amounts of data and computations, requiring robust infrastructure and parallelization.
  • Low-latency APIs are essential for serving generative AI models.
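To make the retrieval-augmented generation point concrete, here is a minimal sketch of the pattern: retrieve the documents most relevant to a query, then inject them into the prompt at inference time. The keyword-overlap scorer, document strings, and `build_prompt` helper are all illustrative assumptions; a real system would use embedding similarity and a vector store.

```python
def retrieve(query, documents, k=2):
    """Rank documents by keyword overlap with the query (a toy stand-in
    for embedding similarity) and return the top k."""
    query_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, documents):
    """Augment the user query with retrieved context before sending it
    to the LLM, grounding the answer in fresh, relevant data."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Metaflow versions every run's data and code automatically.",
    "Retrieval-augmented generation injects documents at inference time.",
    "Our cafeteria serves lunch at noon.",
]
prompt = build_prompt("How does retrieval-augmented generation work?", docs)
```

Because the retrieval step runs per request, the model can answer with data that did not exist when it was trained, which is the inference-time framing of the hallucination problem above.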
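The DAG view of a workflow can be sketched in a few lines. The step names below are a hypothetical flow, not from the talk; frameworks like Metaflow build and execute a graph like this for you (chaining `@step` methods), but the underlying idea is just a topological ordering of dependent steps, shown here with the standard library.

```python
from graphlib import TopologicalSorter  # stdlib since Python 3.9

# A hypothetical generative-AI flow as a DAG: each step maps to the set
# of steps it depends on. An orchestrator runs steps in a topological
# order and can parallelize steps whose dependencies are all met.
flow = {
    "load_data": set(),
    "embed_documents": {"load_data"},
    "fine_tune": {"load_data"},
    "evaluate": {"embed_documents", "fine_tune"},
    "deploy": {"evaluate"},
}

order = list(TopologicalSorter(flow).static_order())
```

Visualizing the workflow this way also shows where versioning matters: every node consumes and produces artifacts (data, models, code) that need to be traceable across runs.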
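On the cost-reduction point, a minimal sketch of symmetric int8 quantization, the memory-saving half of the quantization + LoRA combination (LoRA separately shrinks the number of trainable parameters during fine-tuning). The weight values are made up for illustration; real models quantize tensors per channel or per group with library support.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats onto [-127, 127] with a
    single scale factor, cutting storage from 32 bits to 8 per weight."""
    scale = max(abs(w) for w in weights) / 127
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

The round trip loses at most half a quantization step per weight, which is why quantized models usually retain most of their quality at a fraction of the memory (and therefore serving) cost.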