OpenAI Research Shows Why Chatbots Guess Wrong Under Current Tests – TipRanks

OpenAI, together with Georgia Tech, has released new research examining why chatbots keep making errors. The study argues that the root issue lies not in how the systems are built but in how they are trained and scored. Current evaluation benchmarks grade answers as simply right or wrong, with no reward for admitting a lack of knowledge. As a result, models such as OpenAI's ChatGPT and DeepSeek-V3 learn to guess confidently instead of holding back when unsure.

The team shows that hallucinations, or confidently stated incorrect answers, obey the same statistical rules as errors in simple binary classification. For instance, if a fact appears only once in the training data, the model will almost always get it wrong later. In one test, even leading models gave several different wrong birthdays for one of the paper's authors rather than saying they did not know, showing how the incentive to answer outweighs the incentive to abstain.
The researchers suggest the fix lies in how answers are scored. They propose a system that awards points for correct answers, deducts points for wrong ones, and gives zero for an explicit "I don't know." In trials, models that abstained more often made fewer errors overall, even though their accuracy rate looked lower on paper.
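To see how such a scheme can reward abstention even while accuracy looks worse, here is a minimal sketch with hypothetical numbers (the +1/-1/0 point values and the two model profiles are illustrative assumptions, not figures from the paper):

```python
def score(answers):
    """Score a list of outcomes under the proposed scheme:
    +1 for a correct answer, -1 for a wrong one, 0 for abstaining."""
    points = {"correct": 1, "wrong": -1, "abstain": 0}
    return sum(points[a] for a in answers)

# Model A always guesses: 60 right, 40 wrong out of 100 questions.
always_guess = ["correct"] * 60 + ["wrong"] * 40

# Model B abstains on the 40 questions it is unsure about:
# 55 right, 5 wrong, 40 explicit "I don't know" responses.
abstains = ["correct"] * 55 + ["wrong"] * 5 + ["abstain"] * 40

print(score(always_guess))  # 60 - 40 = 20
print(score(abstains))      # 55 - 5  = 50
```

Model B's raw accuracy (55%) is lower than Model A's (60%), yet it makes far fewer errors and earns a much higher score, which is exactly the trade-off the researchers argue current right-or-wrong benchmarks fail to reward.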
For investors and users, the study suggests that AI errors stem more from training incentives than from hidden faults, and that better scoring rules could build trust in AI systems used in fields such as finance, health, and law. Trust matters commercially as well: the more users trust a chatbot, the greater its potential to boost the company's top line.
Using TipRanks’ Comparison Tool, we analyzed several leading companies developing AI chatbots similar to ChatGPT. This side-by-side view helps investors better understand each stock as well as the broader AI chatbot market.