LLMs gone wild - Tess Ferrandez-Norlander - NDC Oslo 2024

Learn about key components, challenges, and best practices for building RAG systems with LLMs, including data preparation, accuracy metrics, and practical implementation tips.

Key takeaways
  • RAG (Retrieval-Augmented Generation) currently accounts for roughly 75% of LLM applications, grounding LLM outputs in factual data

  • Key components of successful RAG implementations:

    • Proper data preparation and chunking
    • Selection of appropriate embedding models
    • Effective metadata filtering
    • Guardrails against hallucinations and sensitive data
    • Evaluation frameworks for accuracy
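The data-preparation and retrieval components above can be sketched end to end. This is an illustrative toy, not the speaker's implementation: the word-based chunker and the bag-of-words "embedding" are stand-ins for a trained embedding model and a vector database.

```python
import math
from collections import Counter

def chunk(text, size=50, overlap=10):
    """Split text into overlapping word-based chunks (sizes in words).
    Overlap keeps context that would otherwise be cut at chunk borders."""
    words = text.split()
    step = size - overlap
    chunks = []
    for i in range(0, len(words), step):
        chunks.append(" ".join(words[i:i + size]))
        if i + size >= len(words):
            break
    return chunks

def embed(text):
    """Toy 'embedding': a bag-of-words term-count vector.
    A real RAG system uses a trained embedding model here."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The retrieved chunks would then be pasted into the prompt as grounding context for the LLM.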
  • RAG pipeline accuracy varies by stage:

    • Generation accuracy: 60-85%
    • Retrieval accuracy: 60-85%
    • Combined RAG system accuracy: ~72%
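The ~72% combined figure is consistent with the two stages' errors compounding multiplicatively. Assuming the stages fail independently and each sits at the top of its 60-85% range:

```python
# If retrieval and generation each succeed ~85% of the time and fail
# independently, end-to-end accuracy is (at best) their product.
retrieval_accuracy = 0.85
generation_accuracy = 0.85
combined_accuracy = retrieval_accuracy * generation_accuracy  # 0.7225, i.e. ~72%
```

This is why optimizing a single stage in isolation has limited payoff: the weaker stage caps the whole system.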
  • Critical optimization areas:

    • Data ingestion and preparation
    • Chunk size and strategy
    • Context window management
    • Prompt engineering
    • Re-ranking of results
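Of these, re-ranking is easy to illustrate: a cheap first-pass retriever returns many candidates, then a stronger scorer reorders them before the best few go into the prompt. The `term_overlap` scorer below is a toy stand-in for the cross-encoder model a production system would use:

```python
def term_overlap(query: str, doc: str) -> float:
    """Toy relevance scorer: fraction of query terms present in the doc.
    A production re-ranker would use a cross-encoder model instead."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def rerank(query, candidates, scorer=term_overlap, top_k=3):
    """Second-pass re-ranking of first-stage retrieval candidates."""
    return sorted(candidates, key=lambda d: scorer(query, d), reverse=True)[:top_k]
```

Keeping only the top few re-ranked results also helps with context-window management, the item above.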
  • Common challenges:

    • Handling sensitive/PII data
    • Maintaining data freshness
    • Dealing with multi-modal content (images, tables)
    • Managing context windows
    • Preventing hallucinations
  • Key evaluation metrics:

    • Context precision
    • Context recall
    • Answer relevancy
    • Factual accuracy
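The two retrieval-side metrics reduce to set comparisons between the retrieved chunks and the chunks known to be relevant. A minimal sketch follows; frameworks like Ragas compute analogous scores but typically use an LLM judge to estimate relevance instead of ground-truth labels:

```python
def context_precision(retrieved, relevant):
    """Of the chunks we retrieved, what fraction were actually relevant?"""
    relevant_set = set(relevant)
    hits = sum(1 for c in retrieved if c in relevant_set)
    return hits / len(retrieved) if retrieved else 0.0

def context_recall(retrieved, relevant):
    """Of the relevant chunks that exist, what fraction did we retrieve?"""
    retrieved_set = set(retrieved)
    hits = sum(1 for c in relevant if c in retrieved_set)
    return hits / len(relevant) if relevant else 0.0
```

Answer relevancy and factual accuracy operate on the generated answer rather than the retrieved context, so they need a human or LLM judge rather than set arithmetic.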
  • Best practices:

    • Implement proper data filtering
    • Use metadata enrichment
    • Test with real users
    • Add guardrails for sensitive use cases
    • Monitor and evaluate system performance
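A guardrail for sensitive data can start as simply as redacting obvious PII patterns before text reaches the model or the logs. The patterns below are illustrative only and far from exhaustive; real deployments layer in dedicated PII-detection tooling:

```python
import re

# Illustrative patterns only: emails and US-style phone numbers.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def redact_pii(text: str) -> str:
    """Replace recognizable PII with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text
```

Running this on both the retrieved context and the model's output covers the two directions sensitive data can leak.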
  • Tools and frameworks:

    • LangChain
    • LlamaIndex
    • Semantic Kernel
    • Various vector databases
    • Evaluation frameworks like Ragas