How to Make Smart Architecture Decisions when Building Gen AI Apps • Gillian Armstrong • GOTO 2024

Learn key architectural principles for building secure and effective GenAI applications, including RAG patterns, validation layers, security guardrails, and crucial design tradeoffs.

Key takeaways
  • Generative AI increases architectural concerns rather than reducing them, so security, operations, and system design all need careful consideration

  • Never trust the model output (“never trust a genie”): implement multiple layers of validation, access controls, and guardrails around model responses

  • Retrieval-Augmented Generation (RAG) is recommended over pure LLM responses for enterprise applications to maintain control over knowledge and sources

  • Consider three key tradeoffs when building GenAI systems:

    • Speed vs accuracy
    • Cost vs capabilities
    • Safety vs convenience
  • Implement proper prompt engineering and validation at multiple levels:

    • Input validation before the model
    • Output validation after the model
    • Context and knowledge base validation
  • Monitor and measure model performance using metrics like:

    • Context relevance
    • Answer accuracy
    • Response faithfulness
    • User satisfaction
  • Choose models based on the requirements of the specific use case rather than defaulting to the largest or most capable option

  • Keep user input and model outputs outside trust boundaries: treat them like any other untrusted data source

  • Protect against prompt injection and other AI-specific security concerns through proper architecture and guardrails

  • Focus on solving problems where AI provides unique value rather than applying it unnecessarily to simple use cases that could be solved conventionally
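
The takeaways on layered validation and trust boundaries can be made concrete with a small sketch. The code below is not from the talk. It is a minimal illustration, assuming a hypothetical `answer_question` pipeline, a hypothetical `[source:...]` citation convention, and a deliberately tiny deny-list of injection phrases. A real system would need far richer checks on both sides of the model.

```python
import re
from dataclasses import dataclass
from typing import Callable, Set

# Hypothetical deny-list for illustration only; pattern matching alone
# is not a sufficient defence against prompt injection.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal .*system prompt", re.IGNORECASE),
]
MAX_INPUT_CHARS = 2000  # assumed limit, not a recommendation
FALLBACK = "Sorry, I can't help with that request."


@dataclass
class Verdict:
    ok: bool
    reason: str = ""


def validate_input(user_text: str) -> Verdict:
    """Check untrusted user text before it reaches the model."""
    if not user_text.strip():
        return Verdict(False, "empty input")
    if len(user_text) > MAX_INPUT_CHARS:
        return Verdict(False, "input too long")
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_text):
            return Verdict(False, "possible prompt injection")
    return Verdict(True)


def validate_output(answer: str, allowed_sources: Set[str]) -> Verdict:
    """Check untrusted model text before it reaches the user."""
    # Assumed convention: the model cites documents as [source:<name>].
    cited = set(re.findall(r"\[source:([\w.-]+)\]", answer))
    if not cited:
        return Verdict(False, "no source cited")
    unknown = cited - allowed_sources
    if unknown:
        return Verdict(False, f"unknown sources: {sorted(unknown)}")
    return Verdict(True)


def answer_question(
    user_text: str,
    model: Callable[[str], str],
    allowed_sources: Set[str],
) -> str:
    """Wrap a model call with validation on both sides of it."""
    if not validate_input(user_text).ok:
        return FALLBACK
    draft = model(user_text)  # the draft is still untrusted here
    if not validate_output(draft, allowed_sources).ok:
        return FALLBACK
    return draft
```

The `model` argument is any callable that takes a prompt and returns text, so a stub can stand in for a real LLM client while testing the guardrails. Both validators return the same fallback on failure, which keeps rejected input and rejected output from leaking details about why a request was refused.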