Lies, damned lies and large language models — Jodie Burchell

AI

Explore the types of LLM hallucinations, how they have evolved across GPT models, and practical methods to reduce false outputs. Learn to measure and mitigate AI inaccuracies.

Key takeaways
  • Two main types of LLM hallucinations exist:

    • Faithfulness hallucinations - deviating from source text/context
    • Factuality hallucinations - generating incorrect factual information
  • GPT model evolution shows increasing capabilities:

    • GPT-1 (120M parameters): Basic grammar
    • GPT-2: More sophisticated text completion
    • GPT-3+: Ability to encode knowledge and generate coherent content
  • Training data quality significantly impacts hallucination rates:

    • Early models relied heavily on unfiltered CommonCrawl data
    • Modern approaches use filtered sources (C4, RefinedWeb)
    • Higher quality input data generally leads to better performance
  • Methods to reduce hallucinations include:

    • Careful prompt engineering
    • Fine-tuning on specific domains
    • Retrieval Augmented Generation (RAG) - a minimal code sketch follows this list
    • Self-refinement and collaborative refinement
    • Using multiple models to cross-validate outputs
  • Measuring hallucination rates:

    • Multiple evaluation datasets exist (TruthfulQA, HaluEval, SQuAD)
    • TruthfulQA specifically tests for common misconceptions
    • Current models still show significant hallucination rates (~30-40%)
    • Measurement methods need to be specific to the use case and domain (see the scoring sketch below)
  • Large context windows help reduce inconsistencies but don’t eliminate hallucinations

  • Trade-offs exist between model size, performance, and hallucination rates

  • Critical evaluation needed when assessing model performance claims
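
To make the RAG bullet above concrete, here is a minimal sketch of Retrieval Augmented Generation. It retrieves the snippets most relevant to a question and builds a prompt that instructs the model to answer only from that context. The bag-of-words retriever and the `call_llm` stub are illustrative stand-ins for a real embedding model and completion API, not part of any particular library.

```python
# Minimal RAG sketch: retrieve relevant documents, then ground the prompt in
# them so the model answers from retrieved text rather than parametric memory.
from collections import Counter
from math import sqrt

DOCUMENTS = [
    "Faithfulness hallucinations occur when a model deviates from the source text it was given.",
    "Factuality hallucinations occur when a model generates incorrect factual information.",
    "TruthfulQA is an evaluation dataset that tests models for common misconceptions.",
]

def bag_of_words(text: str) -> Counter:
    """Tiny stand-in for an embedding model: word-count vectors."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Ground the prompt in retrieved context to discourage hallucination."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    """Hypothetical model call; replace with your provider's client."""
    raise NotImplementedError

if __name__ == "__main__":
    print(build_prompt("What is a faithfulness hallucination?"))
```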
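
To illustrate the measurement bullet, below is a minimal sketch of scoring a model against a TruthfulQA-style set of questions with known correct answers and common misconceptions. The substring check is a deliberate simplification (TruthfulQA itself relies on human raters or a judge model), and the single example question is illustrative rather than taken from the benchmark.

```python
# Minimal sketch of measuring a hallucination rate on a TruthfulQA-style set.
from dataclasses import dataclass

@dataclass
class Example:
    question: str
    correct: list[str]         # acceptable answers
    misconceptions: list[str]  # common false answers the model might repeat

# Illustrative data only, not actual benchmark entries.
DATASET = [
    Example(
        question="What happens if you swallow chewing gum?",
        correct=["it passes through the digestive system"],
        misconceptions=["it stays in your stomach for seven years"],
    ),
    # ... more examples ...
]

def is_hallucination(answer: str, example: Example) -> bool:
    """Flag an answer that repeats a misconception or matches no correct answer."""
    text = answer.lower()
    if any(m in text for m in example.misconceptions):
        return True
    return not any(c in text for c in example.correct)

def hallucination_rate(answers: dict[str, str]) -> float:
    """Fraction of dataset questions flagged; unanswered questions count as hallucinations."""
    flagged = sum(is_hallucination(answers.get(ex.question, ""), ex) for ex in DATASET)
    return flagged / len(DATASET)

if __name__ == "__main__":
    model_answers = {
        "What happens if you swallow chewing gum?":
            "It stays in your stomach for seven years.",
    }
    print(f"Hallucination rate: {hallucination_rate(model_answers):.0%}")
```

Because the scoring rule here is crude, treat the resulting number as a rough signal; as the takeaways note, measurement methods should be tailored to your use case and domain.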