OpenAI Research Shows Why Chatbots Guess Wrong Under Current Tests – TipRanks

OpenAI, together with Georgia Tech, has released new research examining why chatbots keep making errors. The study argues that the root issue lies not in how the systems are built but in how they are trained and scored. Current evaluation benchmarks grade answers as simply right or wrong, with no reward for admitting a lack of knowledge. As a result, models such as OpenAI's ChatGPT and DeepSeek-V3 learn to guess confidently instead of holding back when unsure.

The team shows that hallucinations, or confidently stated incorrect answers, obey the same statistical rules as errors in simple binary classification. For instance, if a fact appears only once in the training data, the model will almost always get it wrong later. In one test, even leading models gave several different wrong birthdays for one of the paper's authors rather than saying they did not know, showing how the incentive to answer outweighs the incentive to abstain.
The researchers suggest the fix lies in how answers are scored. They propose a system that awards points for correct answers, deducts points for wrong ones, and gives zero for an explicit "I don't know." In trials, models that abstained more often made fewer errors overall, even though their accuracy rate looked lower on paper.
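To see how such a scheme can reward abstention even while accuracy looks worse, here is a minimal sketch with hypothetical numbers (the +1/-1/0 point values and the two model profiles are illustrative assumptions, not figures from the paper):

```python
def score(answers):
    """Score a list of outcomes under the proposed scheme:
    +1 for a correct answer, -1 for a wrong one, 0 for abstaining."""
    points = {"correct": 1, "wrong": -1, "abstain": 0}
    return sum(points[a] for a in answers)

# Model A always guesses: 60 right, 40 wrong out of 100 questions.
always_guess = ["correct"] * 60 + ["wrong"] * 40

# Model B abstains on the 40 questions it is unsure about:
# 55 right, 5 wrong, 40 explicit "I don't know" responses.
abstains = ["correct"] * 55 + ["wrong"] * 5 + ["abstain"] * 40

print(score(always_guess))  # 60 - 40 = 20
print(score(abstains))      # 55 - 5  = 50
```

Model B's raw accuracy (55%) is lower than Model A's (60%), yet it makes far fewer errors and earns a much higher score, which is exactly the trade-off the researchers argue current right-or-wrong benchmarks fail to reward.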
For investors and users, the study suggests that AI errors stem more from training incentives than from hidden faults, and that better scoring rules could build trust in AI systems used in fields such as finance, health, and law. Trust matters commercially as well: the more users trust a chatbot, the greater its potential to boost the company's top line.
Using TipRanks’ Comparison Tool, we analyzed several leading companies developing AI chatbots similar to ChatGPT. This side-by-side view helps investors better understand each stock as well as the broader AI chatbot market.