Talks - Jodie Burchell: Lies, damned lies and large language models
Explore how LLMs compress data, leading to hallucinations, and learn practical strategies to improve accuracy. Covers RAG, prompt engineering, and evaluation methods.
- LLMs are essentially doing “lossy compression” of their training data, leading to information loss and potential hallucinations
- Two main types of hallucinations (illustrated below):
  - Faithfulness hallucinations: model deviates from given context
  - Factuality hallucinations: model generates incorrect facts
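
A concrete pair of invented examples (not from the talk) makes the distinction easier to see: a faithfulness hallucination contradicts the context the model was given, while a factuality hallucination contradicts real-world facts regardless of context.

```python
# Invented examples illustrating the two hallucination types.

context = "The Eiffel Tower was completed in 1889 and is 330 metres tall."

# Faithfulness hallucination: the answer contradicts the supplied context,
# even though the correct information was handed to the model.
faithfulness_example = {
    "context": context,
    "question": "According to the context, when was the Eiffel Tower completed?",
    "model_answer": "It was completed in 1901.",
}

# Factuality hallucination: no context is involved; the answer simply
# contradicts real-world facts (Gustave Eiffel's company built the tower).
factuality_example = {
    "question": "Who designed the Eiffel Tower?",
    "model_answer": "It was designed by Antoni Gaudí.",
}
```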
- Common data quality issues contributing to hallucinations:
  - Training data containing misinformation and conspiracy theories
  - Low-quality sources
  - Inadequately filtered web content
  - Outdated information
- Key methods to reduce hallucinations:
  - Retrieval Augmented Generation (RAG)
  - Better prompt engineering
  - Domain-specific datasets
  - Self-refinement and collaborative refinement (see the sketch after this list)
  - Improved data filtering
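
Of these, self-refinement is the easiest to sketch without extra infrastructure: the model answers, critiques its own answer, and revises it. The loop below is a minimal illustration; `llm()` is a placeholder for whatever chat-completion client you use, not an API from the talk.

```python
def llm(prompt: str) -> str:
    """Placeholder: send `prompt` to your model and return its text reply."""
    raise NotImplementedError


def self_refine(question: str, rounds: int = 2) -> str:
    """Answer, then repeatedly critique and revise the answer."""
    answer = llm(f"Answer the question concisely:\n{question}")
    for _ in range(rounds):
        critique = llm(
            "Review the answer below for factual errors or unsupported claims. "
            "List the problems, or reply 'OK' if there are none.\n"
            f"Question: {question}\nAnswer: {answer}"
        )
        if critique.strip().upper() == "OK":
            break
        answer = llm(
            "Rewrite the answer so it fixes the problems listed.\n"
            f"Question: {question}\nAnswer: {answer}\nProblems: {critique}"
        )
    return answer
```

In practice the critique step tends to be more reliable when it is grounded in retrieved evidence (i.e. combined with RAG) rather than relying only on the model's own knowledge.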
- RAG implementation considerations (minimal sketch after this list):
  - Document chunk size
  - Choice of embedding model
  - Retrieval method
  - Vector database selection
  - Prompt construction
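
A minimal end-to-end sketch of how these choices fit together, assuming `sentence-transformers` and an in-memory NumPy index standing in for a real vector database; the chunk size, model name, and top-k value are illustrative defaults, not recommendations from the talk.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # one possible embedding model


def chunk(text: str, size: int = 200) -> list[str]:
    """Naive fixed-size chunking by word count."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_index(documents: list[str]) -> tuple[list[str], np.ndarray]:
    """Chunk and embed all documents; a vector database would store these."""
    chunks = [c for doc in documents for c in chunk(doc)]
    vectors = embedder.encode(chunks, normalize_embeddings=True)
    return chunks, vectors


def retrieve(query: str, chunks: list[str], vectors: np.ndarray, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = vectors @ q                      # cosine similarity (embeddings are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Stuff the retrieved chunks into the prompt and constrain the model to them."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

Each consideration in the list maps to one knob here: `chunk()` (chunk size), the `SentenceTransformer` model (embedding model), `retrieve()` (retrieval method, with NumPy in place of a vector database), and `build_prompt()` (prompt construction).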
- Model size trends:
  - GPT-1: ~120 million parameters
  - GPT-3: 175 billion parameters
  - GPT-4: ~1 trillion parameters (estimated; not officially disclosed)
  - Larger models can encode more information but remain prone to hallucinations
- Measuring hallucination rates:
  - TruthfulQA dataset for factuality
  - HaluEvalQA for faithfulness
  - Multiple-choice vs. open-ended evaluation (scoring sketch below)
  - Need for domain-specific evaluation metrics
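
As a rough illustration of the multiple-choice setting, the sketch below scores mc1-style accuracy on TruthfulQA via the Hugging Face `datasets` library. `pick_choice()` is a placeholder for your model call, and the dataset id and field names may differ between dataset versions.

```python
from datasets import load_dataset


def pick_choice(question: str, choices: list[str]) -> int:
    """Placeholder: ask the model to pick one option and return its index."""
    raise NotImplementedError


def mc1_accuracy(n_questions: int = 100) -> float:
    """Fraction of questions where the model picks the truthful option."""
    data = load_dataset("truthful_qa", "multiple_choice", split="validation")
    correct = 0
    for row in data.select(range(n_questions)):
        choices = row["mc1_targets"]["choices"]
        labels = row["mc1_targets"]["labels"]  # exactly one choice is labelled 1 (truthful)
        picked = pick_choice(row["question"], choices)
        correct += labels[picked]
    return correct / n_questions
```

Open-ended evaluation is harder to automate, which is part of why domain-specific metrics and judge setups come up as an open need.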