# Chatbots

Why does ChatGPT 'hallucinate'? OpenAI blames testing methods – Storyboard18

Welcome to the forefront of conversational AI as we explore the fascinating world of AI chatbots in our dedicated blog series. Discover the latest advancements, applications, and strategies that propel the evolution of chatbot technology. From enhancing customer interactions to streamlining business processes, these articles delve into the innovative ways artificial intelligence is shaping the landscape of automated conversational agents. Whether you’re a business owner, developer, or simply intrigued by the future of interactive technology, join us on this journey to unravel the transformative power and endless possibilities of AI chatbots.
When ChatGPT or similar tools make up facts with confidence, it’s not because they’re “lying” but because of how they’ve been trained and tested, OpenAI has revealed. The company says fixing artificial intelligence hallucinations may require rethinking how AI performance is measured, not just how models are built.
Hallucinations, in AI terms, occur when a chatbot generates answers that sound convincing but are factually incorrect. In one example, researchers found the system invented details about a scientist’s PhD dissertation and even gave the wrong birthday. The problem, OpenAI argues, comes less from flawed memory and more from incentives baked into evaluation.
Most current benchmarks reward a correct answer but treat an "I don't know" response as a failure. This encourages models to guess, much like students facing a multiple-choice test. Over time, a model learns that sounding confident, even when wrong, scores better than admitting uncertainty.
“Instead of rewarding only accuracy, tests should penalize confident mistakes more than honest admissions of uncertainty,” OpenAI suggested in its latest research. In short, an honest "I don't know" should count for more than a bold but wrong answer.
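To make that incentive concrete, here is a minimal Python sketch of the kind of scoring rule the research describes. The function name, the penalty value, and the example items are illustrative assumptions, not OpenAI's actual benchmark.

```python
# Hypothetical scoring rule (illustrative values, not OpenAI's benchmark):
# a correct answer earns +1, an abstention earns 0, and a confident wrong
# answer is penalized, so guessing no longer dominates saying "I don't know".

def score_answer(answer, correct, wrong_penalty=2.0):
    if answer is None:                                    # honest "I don't know"
        return 0.0
    if answer.strip().lower() == correct.strip().lower():
        return 1.0                                        # correct answer
    return -wrong_penalty                                 # confident but wrong

# Three items: one right, one abstention, one confident mistake.
items = [("Paris", "Paris"), (None, "12 March 1987"), ("1 Jan 1990", "12 March 1987")]
print(sum(score_answer(a, c) for a, c in items))          # 1.0 + 0.0 - 2.0 = -1.0
```

Under plain accuracy, the abstention and the wrong guess are scored identically (both zero), so a model gains nothing by admitting uncertainty; under a rule like this, a confident mistake is strictly worse than holding back.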
The way large models are trained also plays a role. They learn by predicting the “next word” in billions of sentences, which works well for grammar and common facts but breaks down for rare or specific details, such as birthdays or niche research topics.
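As a rough illustration of why that breaks down, here is a toy bigram "next word" counter, a deliberate simplification rather than how GPT models actually work: frequent grammatical patterns accumulate plenty of evidence, while a specific fact such as a birthday may appear only once and is easily drowned out.

```python
# Toy next-word predictor built from raw bigram counts (a simplification;
# real models use neural networks, but the training signal is analogous).
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . the cat sat on the rug . "
    "alice smith was born on 12 march 1987 ."
).split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1                 # count how often nxt follows prev

def next_word_probs(word):
    counts = following[word]
    total = sum(counts.values())
    return {w: round(c / total, 2) for w, c in counts.items()}

print(next_word_probs("cat"))   # {'sat': 1.0} -> common pattern, strongly supported
print(next_word_probs("on"))    # {'the': 0.67, '12': 0.33} -> the birthday is a
                                #   low-probability tail backed by a single example
```

In a real corpus the imbalance is far more extreme: grammatical regularities recur billions of times, while the exact date of one person's dissertation or birthday may not recur at all, which is why the model's best guess for such details is often confidently wrong.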
Interestingly, OpenAI noted that smaller models sometimes handle uncertainty better than their larger counterparts, declining to guess where a bigger model would answer anyway. This suggests hallucinations are not an unfixable glitch but a matter of designing better guardrails.

