
Why Chatbots Still Hallucinate – and How OpenAI Wants to Fix It – UC Today

Why do AI assistants confidently give wrong answers? The problem isn’t the code, it’s the tests that reward guessing over admitting uncertainty.
Published: September 8, 2025
Christopher Carey
Like students guessing on a tough exam, AI chatbots often bluff when they don’t know the answer.
The result? Plausible-sounding but completely false statements – what researchers call hallucinations – that can mislead users and undermine trust.
Despite steady progress in AI, these hallucinations remain stubbornly present in even the most advanced systems, including GPT-5.
Now, a new paper from OpenAI argues that hallucinations are not strange side effects of machine intelligence, but predictable statistical errors built into how large language models (LLMs) are trained – and, crucially, how they’re tested.
Fixing them, the researchers say, will require a rethink of the benchmarks that drive AI development.
At their core, language models are probability machines. They don’t “know” truth from falsehood the way humans do.
Instead, they predict which words are most likely to follow others based on patterns in their training data.
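To make that concrete, here is a deliberately toy Python sketch of the selection step – the candidate strings and scores are invented, not taken from any real model – showing why the most probable continuation wins whether or not it happens to be true.

```python
import math

# Invented next-token scores for three candidate completions of a factual
# question; a real LLM derives these from billions of parameters and its
# training data, but the selection step works the same way.
logits = {
    "a plausible-sounding but wrong answer": 2.1,
    "another fluent, confident, wrong answer": 1.9,
    "I don't know": 0.3,   # honest, but rarely the highest-probability string
}

# Softmax converts raw scores into a probability distribution.
total = sum(math.exp(v) for v in logits.values())
probs = {k: math.exp(v) / total for k, v in logits.items()}

# Greedy decoding picks whatever is most probable -- fluent and confident,
# whether or not it is true.
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```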
For example, when asked about the title of paper co-author Adam Tauman Kalai’s Ph.D. dissertation, a “widely used chatbot” confidently gave three different answers. All wrong.
The researchers then asked about his birthday, and got three more answers. Also all wrong.
OpenAI formalises this problem with what it calls the Is-It-Valid (IIV) test. In essence, it reduces text generation to a binary classification problem: is a given string valid or invalid?
The maths shows that if a model struggles with this classification, it will necessarily produce hallucinations during generation.
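The following toy sketch shows the correspondence informally – the strings, probabilities, and threshold are all invented for illustration, and the paper proves a quantitative bound rather than this loose version. The idea is that probability mass the model mistakenly treats as “valid” is the same mass it can sample from when generating.

```python
# A toy illustration of the Is-It-Valid idea (not the paper's exact
# construction): a model's probability distribution over strings doubles as a
# valid/invalid classifier if we call a string "valid" whenever the model
# assigns it more than some threshold of probability mass.
model_probs = {
    ("the correct answer", True): 0.30,
    ("a wrong but fluent answer", False): 0.28,   # classifier error: high mass on an invalid string
    ("another wrong answer", False): 0.27,
    ("obvious gibberish", False): 0.15,
}

threshold = 0.25
misclassified = [s for (s, valid), p in model_probs.items() if p > threshold and not valid]

# Generation samples from the very same distribution, so the mass placed on
# invalid strings is also the probability of emitting one -- i.e. hallucinating.
hallucination_mass = sum(p for (s, valid), p in model_probs.items() if not valid)

print("strings misclassified as valid:", misclassified)
print("chance of generating an invalid string:", round(hallucination_mass, 2))
```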
“The model sees only positive examples of fluent language and must approximate the overall distribution,” the researchers explained.
“All base models will err on inherently unlearnable facts. For each person there are 364 times more incorrect birthday claims than correct ones.”
For “arbitrary facts” with no learnable patterns, the error rate bottoms out at a stubbornly high level.
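The birthday figure is simple arithmetic, and it also previews the incentive problem discussed next. A back-of-the-envelope sketch:

```python
# Back-of-the-envelope arithmetic behind the birthday example: 365 candidate
# dates (ignoring leap years), exactly one of which is correct.
candidate_dates = 365
wrong_to_right_ratio = candidate_dates - 1        # the "364 times more" figure
p_correct_blind_guess = 1 / candidate_dates       # ~0.27% if the fact was never learned

# Under accuracy-only scoring, even that tiny chance beats abstaining,
# which scores exactly zero.
expected_score_guess = p_correct_blind_guess * 1.0
expected_score_abstain = 0.0
print(wrong_to_right_ratio, round(expected_score_guess, 4), expected_score_abstain)
```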
In short: hallucinations aren’t bugs, they’re essentially baked into the statistical foundations of language modelling.
The paper argues that post-training – the fine-tuning process where models are adjusted with human feedback – often makes hallucinations worse because of how success is measured.
For a student sitting a test, leaving a question unanswered guarantees zero marks, whereas guessing at least opens a chance to earn points.
Language models face the same incentives when evaluated on accuracy-based benchmarks. Saying “I don’t know” is penalised as much as being wrong, while guessing might look correct.
OpenAI calls this an “epidemic of penalising uncertainty.” Imagine two models:
- Model A answers only when it is confident and says “I don’t know” otherwise.
- Model B always hazards a guess rather than abstain.
On today’s benchmarks, Model B will outperform Model A – not because it’s more accurate overall, but because the scoring system rewards boldness over caution.
Over time, this encourages models to “learn” to hallucinate.
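A small simulation makes the point. The 70% knowledge rate and 10% lucky-guess rate below are invented numbers, not figures from the paper:

```python
import random

random.seed(1)

# Hypothetical numbers for the Model A / Model B comparison above: both models
# genuinely know 70% of the answers; when unsure, Model A abstains while
# Model B guesses and is lucky 10% of the time.
N = 100_000
known = 0.70
lucky_guess = 0.10

def benchmark_accuracy(guesses_when_unsure: bool) -> float:
    score = 0
    for _ in range(N):
        if random.random() < known:
            score += 1                                      # a genuinely known answer
        elif guesses_when_unsure and random.random() < lucky_guess:
            score += 1                                      # a lucky guess counts the same
        # abstaining and guessing wrong both score zero under binary accuracy
    return score / N

print("Model A (abstains when unsure):", round(benchmark_accuracy(False), 3))  # ~0.70
print("Model B (always guesses):      ", round(benchmark_accuracy(True), 3))   # ~0.73
```

Run it and Model B comes out ahead by roughly the lucky-guess margin – exactly the reward for boldness described above.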
This evaluation misalignment isn’t a minor quirk, as benchmarks are the lifeblood of AI research.
Leaderboards showcasing model performance on accuracy-driven tests shape funding, competition, and deployment.
But under current norms, accuracy is a binary: right or wrong.
There’s no room for “uncertain,” or any credit for saying “I don’t know,” and no penalty for confidently spewing falsehoods.
That misalignment means even well-intentioned efforts to curb hallucinations are fighting against the grain.
The researchers warn that bolting on a few extra hallucination evaluations won’t be enough. The dominant benchmarks – the ones that define who’s “winning” in AI – need to change if we want systems that prioritise trustworthiness over test-taking.
OpenAI suggests one solution is to overhaul the way models are graded. Just as some tests penalise wrong answers to discourage blind guessing, AI benchmarks should:
- penalise confident errors more heavily than honest expressions of uncertainty, and
- give partial credit when a model appropriately says “I don’t know” or hedges its answer.
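One way to picture such a grading rule is in the spirit of the confidence-threshold instructions the researchers describe, though the specific threshold and function below are illustrative assumptions rather than OpenAI’s benchmark code:

```python
from typing import Optional

def graded_score(answer_correct: Optional[bool], t: float = 0.75) -> float:
    """Score one answer; answer_correct is None when the model abstains."""
    if answer_correct is None:
        return 0.0                # "I don't know": no reward, no penalty
    if answer_correct:
        return 1.0                # correct answer: full credit
    return -t / (1 - t)          # confident error: explicit penalty

# With t = 0.75 a wrong answer costs 3 points, so a 10%-confidence blind guess
# has negative expected value (0.1 * 1 + 0.9 * -3 = -2.6) and abstaining wins.
print(graded_score(True), graded_score(None), round(graded_score(False), 2))
```

Under this kind of rule, the expected value of a blind guess turns negative, so abstaining is no longer the worst strategy.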
This isn’t just about making chatbots less annoying.
In high-stakes domains like medicine, law, or education, confidently wrong answers can have serious consequences. A system that bluffs less and signals uncertainty more could be far safer – even if it looks less impressive on traditional benchmarks.
For leaders in UC, the research underscores a critical operational challenge: AI-powered chatbots and assistants may confidently provide wrong information, potentially affecting customer interactions, internal collaboration, or automated workflows.
Because current evaluation metrics reward guessing over caution, UC systems that integrate AI could inadvertently propagate errors, reduce user trust, or trigger compliance risks.
As the paper notes, “I Don’t Know (IDK)-type responses are maximally penalised while an overconfident ‘best guess’ is optimal.”
UC leaders should therefore prioritise AI deployments that signal uncertainty, provide partial responses when unsure, and incorporate human oversight to prevent overconfident hallucinations from impacting business decisions.
Hallucinations may never vanish – the maths guarantees that some level of error is inevitable when machines are trained to mimic the messy distribution of human knowledge.
But OpenAI’s researchers argue that we can make them less harmful by changing the incentives.
Right now, AI is like a student trained to maximise test scores by guessing whenever it’s unsure. If we want models that admit uncertainty – and in doing so, become more trustworthy partners – we need to rewrite the tests themselves.
Because as long as the leaderboards keep rewarding lucky guesses, chatbots will keep bluffing.