Philip Meier - From RAGs to riches: Build an AI document interrogation app in 30 mins
Learn how to build an AI document Q&A system using RAGNA, an open-source RAG framework. Explore vector databases, LLM integration, and efficient data retrieval techniques.
-
RAG (Retrieval Augmented Generation) enables LLMs to access up-to-date knowledge through a two-stage process of document embedding and contextual retrieval
-
RAGNA is an open-source framework by Quansight that provides:
- Python API
- REST API
- Web UI
- Flexible architecture for custom components
-
Core RAG workflow:
- Documents are chunked and embedded into vectors
- Stored in a vector database (Chroma and LanceDB supported)
- Chunks most similar to the query embedding are retrieved
- Only relevant context sent to LLM for response
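
The workflow above can be sketched in plain Python. This is a toy illustration, not RAGNA's implementation: the bag-of-words `embed` function stands in for a real embedding model, and the chunk texts are made up for the example.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    # A real system would call a neural embedding model here.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "Documents are chunked and embedded into vectors" + "stored":
chunks = [
    "Ragna exposes a Python API, a REST API, and a web UI.",
    "Vector databases store document chunks as embeddings.",
    "Fine-tuning bakes new knowledge into model weights.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank stored chunks by similarity to the query and keep the
    # top k; only these would be sent to the LLM as context.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

print(retrieve("which databases store embeddings?"))
```

The key property is the last step: instead of stuffing every document into the prompt, only the best-matching chunks are forwarded, which keeps the request inside the model's context window.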
-
Two main approaches for LLMs to handle new data:
- Fine-tuning (long-term memory, costly and time-consuming)
- RAG (short-term memory, more efficient for specific queries)
-
RAGNA features:
- Async support for better performance
- Source tracking to verify responses
- Configurable context windows
- Multiple LLM support (Anthropic, MosaicML, OpenAI)
- Customizable components
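
Async support matters because embedding and LLM calls are I/O-bound. The sketch below is generic `asyncio`, not RAGNA's internals; `embed_remote` is a hypothetical stand-in for a network call.

```python
import asyncio

async def embed_remote(chunk: str) -> list[float]:
    # Hypothetical stand-in for an I/O-bound call to an embedding
    # service; asyncio.sleep simulates latency without blocking.
    await asyncio.sleep(0.01)
    return [float(len(chunk))]

async def embed_all(chunks: list[str]) -> list[list[float]]:
    # All requests are in flight at once, so total wall time is
    # roughly one round trip instead of one per chunk.
    return await asyncio.gather(*(embed_remote(c) for c in chunks))

vectors = asyncio.run(embed_all(["first chunk", "second chunk", "third"]))
print(vectors)
```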
-
Implementation considerations:
- Fixed context size limitations
- Importance of chunking strategy
- Hallucination prevention through source verification
- Token management for efficient processing
-
RAGNA is positioned between experimental frameworks (like LangChain) and fully integrated solutions (like ChatGPT) to provide:
- Better user experience
- Configuration flexibility
- Custom deployment options