Hugo Anderson - Orchestrating Generative AI Workflows to Deliver Business Value | PyData Global 2023

Learn practical strategies for building robust generative AI systems, from infrastructure needs to best practices, with real-world solutions for LLM challenges and optimization.

Key takeaways
  • When moving from traditional ML to generative AI, core infrastructure needs remain similar but with increased emphasis on compute resources, versioning, and orchestration

  • Key challenges with LLMs include:

    • Hallucination and accuracy issues
    • Lack of access to fresh data
    • High compute and infrastructure costs
    • Complex deployment requirements
    • Security and supply chain vulnerabilities
  • Solutions for improving LLM performance:

    • Fine-tuning on specific datasets
    • Retrieval-augmented generation (RAG)
    • Regular model updates with fresh data
    • Model optimization techniques like quantization
    • Swappable model architecture for flexibility
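The retrieval-augmented generation idea above can be sketched in a few lines. This is a toy illustration, not the speaker's implementation: retrieval here is naive keyword-overlap scoring, whereas a real system would use vector embeddings and send the final prompt to an actual LLM (both are assumptions stubbed out here).

```python
def score(query: str, doc: str) -> int:
    """Toy relevance score: count query words that appear in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k most relevant documents for the query."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Ground the model's answer in retrieved context (fresh data)."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "The 2023 pricing tier starts at $20 per seat.",
    "Our API rate limit is 100 requests per minute.",
    "Support hours are 9am to 5pm UTC on weekdays.",
]
prompt = build_prompt("What is the API rate limit?", corpus)
# `prompt` would then be passed to whichever LLM backend you use.
```

Because the model only sees retrieved context, RAG addresses both hallucination and the fresh-data problem without retraining.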
  • Infrastructure considerations:

    • Need for robust versioning of code, data, and models
    • Proper orchestration of workflows using DAGs (directed acyclic graphs)
    • Resource management through decorators and configurations
    • Balance between open source and vendor APIs
    • Kubernetes and cloud deployment capabilities
  • Best practices for generative AI systems:

    • Perform cost-benefit analysis of model choices
    • Consider security implications in the ML supply chain
    • Enable easy model/component swapping
    • Implement proper monitoring and versioning
    • Focus on reducing single points of failure
  • Data scientists should focus on model development, supported by infrastructure that handles:

    • Computing resource allocation
    • Workflow orchestration
    • Model deployment
    • Data freshness
    • Version control
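One way the versioning bullet can be made concrete (a minimal sketch, not the speaker's tooling) is content-addressed versioning: identical bytes always hash to the same version ID, so a run manifest pins code, data, and model together and stale artifacts are easy to detect. The `record_run` helper and its fields are illustrative assumptions.

```python
import hashlib
import json

def version_id(data: bytes) -> str:
    """Short content hash used as an immutable version identifier."""
    return hashlib.sha256(data).hexdigest()[:12]

def record_run(code_rev: str, data: bytes, model: bytes) -> dict:
    """Pin everything a training or inference run depends on."""
    return {
        "code": code_rev,           # e.g. a git commit SHA
        "data": version_id(data),   # version of the dataset bytes
        "model": version_id(model), # version of the model weights
    }

manifest = record_run("abc1234", b"training rows...", b"weights...")
print(json.dumps(manifest, indent=2))
```

Storing such a manifest with every workflow run gives the reproducibility that monitoring and rollback both depend on.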