Why do AI models hallucinate?
Defines hallucinations, explains why AI can confidently give wrong answers, and shows examples of how they appear.
Even advanced AI like Claude can hallucinate; Anthropic explains why it happens, how to spot it, and how to curb it with sources and skepticism.
Summary
Jordan from Anthropic explains that despite Claude's advances, AI hallucinations (confidently false claims) still happen. He notes examples like citing non-existent papers by Jared Kaplan and fabricating statistics, illustrating why users should stay skeptical. Claude hallucinates far less than it did a year ago thanks to ongoing work on the problem, but the problem isn't solved, and wrong answers can still look convincing. The video covers how these errors arise: AIs learn from vast amounts of text and predict what comes next, which can go astray on obscure topics or niche facts. Anthropic trains Claude to say "I don't know" when unsure and runs thousands of targeted tests to measure appropriate hedging versus false confidence. Practical tips are offered for users: ask for sources, verify citations, and cross-reference with trusted sources, especially for critical work. If something seems off, start a new chat to probe and challenge the model's claims. The message is candid: hallucinations are a persistent challenge across AI, but transparent evaluation and careful checking make AIs more trustworthy over time, with progress shared on Anthropic's blog and in the Anthropic Academy.
Key Takeaways
- Claude’s hallucinations occur when obscure facts or niche topics lack sufficient data, causing the model to guess rather than abstain.
- Anthropic trains Claude to say "I don't know" and tests it with thousands of questions designed to trigger uncertainty or falsehoods.
- With every new Claude version, hallucination rates drop, but the issue remains an ongoing, not solved, challenge for the AI field.
- To reduce hallucinations, users should request sources, verify cited materials, and cross-reference with trusted, external sources.
- If unsure, ask the model how confident it is and whether anything might be wrong, or start a new chat to probe for errors and corroborating evidence.
Who Is This For?
Essential viewing for developers and researchers using Claude or other LLMs who need to understand why models make up facts and how to mitigate it in practice.
Notable Quotes
"Hallucinations are hard to anticipate, hard to catch, and the wrong answer often looks exactly like it could be the right one."
—Defines the core challenge of hallucinations and why they are dangerous.
"We regularly test Claude with thousands of questions specifically designed to trip it up."
—Describes the rigorous evaluation process to reduce errors.
"During training, we teach Claude to be honest and to say, 'I don’t know' when it's not sure."
—Shows a core mitigation strategy to improve trustworthiness.
"If you have an answer you're unsure about, start a new chat and ask the AI to find errors in the answer and to confirm that the sources support the statements."
—Practical tip for users to verify information.
"Reducing hallucinations is an important goal to make AIs more trustworthy and useful to everyone."
—Summarizes the overarching objective of their work.
Questions This Video Answers
- How can I verify AI-generated citations and numbers from Claude?
- Why do AI models hallucinate when asked about obscure research papers?
- What practical steps can I take to reduce hallucinations when using LLMs for critical work?
- How does Claude's training approach encourage honesty and uncertainty when facts are unclear?
- What should I do if an AI confidently provides incorrect information?
Tags: Claude, Anthropic, AI hallucinations, Jared Kaplan papers, "I don't know", citation verification, AI evaluation, Anthropic Academy
Full Transcript
If AI is so advanced, why does it sometimes make stuff up? My name is Jordan and I work at Anthropic. We make Claude, an AI assistant, and we do a lot to make sure it gives you accurate information. But sometimes AIs still make things up. We call these errors hallucinations, and they're often worse than just making a mistake because the AI will appear very confident or even try to convince you that it's right. Hallucinations can show up in a lot of ways. The AI might cite a research paper that doesn't exist, make up fake statistics, or get facts wrong about real people or real events.
Here's what it looks like. You ask Claude to tell you about some papers written by Jared Kaplan. It confidently gives you answers. None of those titles actually exist. Claude hallucinates much less than even a year ago. Honestly, it took us a while to find an example like this because we've put a lot of work into reducing hallucinations in Claude. But that's kind of the point. Hallucinations are hard to anticipate, hard to catch, and the wrong answer often looks exactly like it could be the right one. And since hallucinations are becoming more rare, people often don't bother to check the AI's work.
So, let's talk about why this happens, what we're doing about it, and how you can catch hallucinations when you use AI. AI assistants like Claude learn by reading huge amounts of text from the internet. They get really good at figuring out what words or ideas typically come next. Kind of like how your phone suggests the next word as you type. This works well most of the time, but when you ask about something obscure, like specific research papers from a relatively unknown researcher, there just isn't enough information for the AI to draw from. So, it tries to be helpful and takes a guess.
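To make the next-word idea concrete, here is a toy sketch in Python. It is purely illustrative and far simpler than Claude's actual architecture: a model that predicts the next word from counts in a tiny training text and still takes a blind guess when it has no data, which is the same failure mode that produces hallucinations on obscure topics.

```python
# Toy next-word predictor (illustrative only; not how Claude actually works).
from collections import Counter, defaultdict
import random

corpus = "the cat sat on the mat . the cat chased the dog .".split()

# Count which word follows each word in the training text.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the most frequent next word, or a blind guess if unseen."""
    if word in next_counts:
        return next_counts[word].most_common(1)[0][0]
    # No data for this word: the model still "answers" instead of abstaining.
    return random.choice(corpus)

print(predict_next("cat"))     # grounded in the data: "sat" or "chased"
print(predict_next("quasar"))  # never seen in training: a confident-sounding guess
```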
And sometimes that guess is wrong. It's a bit like asking a friend who's read every popular book and takes a lot of pride in knowing all the random facts about them. But because they want to seem like the expert, they sometimes say something confidently wrong instead of admitting, "I don't know." AIs are trained to be helpful, so they want to give you some answer even when they're not sure. But we have ways to mitigate this. During training, we teach Claude to be honest and to say, "I don't know," when it's not sure. We try to teach Claude that being honest is both the right thing to do and also part of how to be more helpful.
We regularly test Claude with thousands of questions specifically designed to trip it up: obscure facts, niche topics, questions where the truthful answer is, "I don't know." We measure things like: How often does Claude correctly say it's unsure? Does it make up citations or statistics? How often does it hedge appropriately versus stating something false with confidence? These tests help us catch problems and track our progress. With each new version of Claude, we've seen improvements, but we're honest that this is an ongoing challenge for the entire AI field, not at all a solved problem. If you're wondering how to spot when this happens, hallucinations are most likely to happen in a few types of situations.
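As a rough illustration of what such an evaluation can look like, here is a hypothetical sketch: the questions, the `ask_model` stub, and the hedge-phrase scoring heuristic are all made up for this example and are not Anthropic's actual test suite.

```python
# Hypothetical hallucination-eval sketch; not Anthropic's actual test suite.
HEDGE_PHRASES = ("i don't know", "i'm not sure", "can't verify")

# Each item pairs a tricky question with the behavior we want:
# abstaining when the truthful answer is "I don't know."
eval_set = [
    {"question": "List three 2009 papers by a researcher named A. Novak.",
     "expect_abstain": True},
    {"question": "What is the capital of France?",
     "expect_abstain": False},
]

def ask_model(question: str) -> str:
    # Stand-in for a real LLM API call; replace with an actual request.
    return "I'm not sure; I can't verify papers by that author."

def hedged(answer: str) -> bool:
    """Crude check for whether the answer expresses uncertainty."""
    return any(phrase in answer.lower() for phrase in HEDGE_PHRASES)

# Correct behavior means hedging exactly when we expect the model to abstain.
results = [hedged(ask_model(item["question"])) == item["expect_abstain"]
           for item in eval_set]
print(f"appropriate behavior on {sum(results)}/{len(results)} questions")
```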
For example, if you're asking for specific facts, statistics, or citations, if the topic is obscure, niche, or very recent, if you're asking about real but not widely known people or places, or when you need exact details like dates, names, or numbers. Here are some tips you can use to reduce hallucinations. First, ask the AI to find sources to back up its claims. And if it already gave sources, ask it to check that those sources actually support what it's saying. Try telling the AI upfront, "It's okay if you don't know." And if you're unsure about an answer, ask the AI how confident it is and whether anything might be wrong.
Often, the AI knows it's wrong, but just wanted to sound confident. If you have an answer you're unsure about, start a new chat and ask the AI to find errors in the answer and to confirm that the sources support the statements. For critical work, you should cross-reference with trusted sources. Be skeptical and double-check specific numbers, dates, and citations. If something sounds off, ask follow-up questions. Reducing hallucinations is an important goal to make AIs more trustworthy and useful to everyone. We'll continue to share our progress in this area on our blog. You can learn about other tools and frameworks for working with AI in the Anthropic Academy.
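The "start a new chat" check is easy to script. The sketch below uses the Anthropic Python SDK to send an earlier answer to a fresh conversation for review; it assumes an `ANTHROPIC_API_KEY` environment variable is set, and the model ID and prompt wording are illustrative choices, not an official recipe.

```python
# Minimal sketch of a fresh-chat verification pass using the Anthropic
# Python SDK (pip install anthropic; reads ANTHROPIC_API_KEY from the env).
import anthropic

client = anthropic.Anthropic()

def verify_answer(answer: str) -> str:
    """Ask a brand-new conversation to hunt for errors in an earlier answer."""
    prompt = (
        "Review the following answer for factual errors. Check whether any "
        "cited sources actually support the claims, flag statistics or "
        "citations you cannot verify, and say 'I don't know' where you are "
        f"unsure.\n\nAnswer to review:\n{answer}"
    )
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # replace with the model ID you use
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

print(verify_answer("Paste an answer you are unsure about here."))
```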