Lies, damned lies and large language models — Jodie Burchell
Explore types of LLM hallucinations, their evolution through GPT models, and practical methods to reduce false outputs. Learn to measure and mitigate AI inaccuracies.
- Two main types of LLM hallucinations exist:
  - Faithfulness hallucinations: deviating from the source text or context
  - Factuality hallucinations: generating incorrect factual information
 
- GPT model evolution shows increasing capabilities:
  - GPT-1 (120M parameters): basic grammar
  - GPT-2: more sophisticated text completion
  - GPT-3+: ability to encode knowledge and generate coherent content
 
- Training data quality significantly impacts hallucination rates:
  - Early models relied heavily on unfiltered CommonCrawl data
  - Modern approaches use filtered sources (C4, RefinedWeb); see the sketch after this list
  - Higher-quality input data generally leads to better performance
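
To make "filtered sources" concrete, here is a toy sketch of C4-style heuristic cleaning. The rules echo the published C4 filters (keep sentence-like lines, drop pages with placeholder text or raw code), but the exact thresholds and the `clean_document` helper are illustrative assumptions, not any model's actual pipeline.

```python
# Sketch of C4-style heuristic cleaning of raw web text (illustrative
# thresholds, not any model's actual pipeline).
def clean_document(text: str) -> str | None:
    """Return cleaned text, or None if the whole document should be dropped."""
    if "lorem ipsum" in text.lower() or "{" in text:
        return None  # placeholder-text and code/markup heuristics
    kept_lines = []
    for line in text.splitlines():
        line = line.strip()
        # Keep only sentence-like lines: a few words long, ending in
        # terminal punctuation.
        if len(line.split()) >= 5 and line.endswith((".", "!", "?", '"')):
            kept_lines.append(line)
    if len(kept_lines) < 3:
        return None  # too little usable prose to keep the page
    return "\n".join(kept_lines)
```

Real cleaning pipelines also deduplicate and filter by language; the sketch only shows the line-level heuristics.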
 
- Methods to reduce hallucinations include:
  - Careful prompt engineering
  - Fine-tuning on specific domains
  - Retrieval-Augmented Generation (RAG); see the sketch after this list
  - Self-refinement and collaborative refinement
  - Using multiple models to cross-validate outputs
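
RAG is only named above, so the following is a minimal sketch of the general pattern rather than anything shown in the talk. Retrieval uses scikit-learn's TF-IDF for simplicity, and `generate` is a hypothetical placeholder for whatever LLM API you actually call.

```python
# Minimal Retrieval-Augmented Generation (RAG) sketch. Retrieval here is
# plain TF-IDF similarity; production systems typically use dense
# embeddings and a vector store, but the shape of the loop is the same.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

DOCUMENTS = [
    "GPT-1 was released in 2018 with roughly 120M parameters.",
    "TruthfulQA probes language models for common misconceptions.",
    "Retrieval Augmented Generation grounds answers in retrieved documents.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(DOCUMENTS)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_matrix)[0]
    top_indices = scores.argsort()[::-1][:k]
    return [DOCUMENTS[i] for i in top_indices]

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call (API or local model)."""
    raise NotImplementedError("Swap in your model of choice here.")

def rag_answer(question: str) -> str:
    """Build a grounded prompt from retrieved context, then generate."""
    context = "\n".join(retrieve(question))
    prompt = (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return generate(prompt)
```

The point is that the model answers from retrieved text instead of relying purely on what it memorised during training, which is where many factual errors originate.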
 
- Measuring hallucination rates:
  - Multiple evaluation datasets exist (TruthfulQA, HaluEval, SQuAD)
  - TruthfulQA specifically tests for common misconceptions; see the loading sketch after this list
  - Current models still show significant hallucination rates (~30-40%)
  - Measurement methods need to be specific to the use case and domain
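
As a rough illustration of how such a measurement can be run, the sketch below loads TruthfulQA's multiple-choice split with the Hugging Face `datasets` library and computes a simple accuracy. `pick_answer` is a hypothetical placeholder for the model under test, and the `limit` parameter just keeps the example quick.

```python
# Sketch: estimating a model's factuality with TruthfulQA's
# multiple-choice split from the Hugging Face Hub.
from datasets import load_dataset

def pick_answer(question: str, choices: list[str]) -> int:
    """Hypothetical model under test: return the index of the choice it
    believes is true (e.g. by scoring each choice's log-likelihood)."""
    raise NotImplementedError

def truthfulqa_accuracy(limit: int = 100) -> float:
    ds = load_dataset("truthful_qa", "multiple_choice", split="validation")
    correct = 0
    total = 0
    for row in ds.select(range(min(limit, len(ds)))):
        choices = row["mc1_targets"]["choices"]
        labels = row["mc1_targets"]["labels"]  # exactly one label is 1
        prediction = pick_answer(row["question"], choices)
        correct += int(labels[prediction] == 1)
        total += 1
    return correct / total
```

Any single benchmark number is a proxy at best; as the last point above says, measurement should be tailored to your own use case and domain.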
 
- Large context windows help reduce inconsistencies but don’t eliminate hallucinations
- Trade-offs exist between model size, performance, and hallucination rates
- Critical evaluation is needed when assessing model performance claims