Strange bug with popular chatbot highlights underlying issues with AI
OpenAI has solved a “goblin mystery” impacting ChatGPT that caused the AI chatbot to become obsessed with the mythical creatures.
Over the last six months, mentions of the word ‘goblin’ have shot up in ChatGPT, even in response to unrelated queries. The phenomenon prompted an investigation by OpenAI researchers, who found that the bug “crept in subtly” following the release of a new ChatGPT model last November.
The new model was designed to be “smarter and more conversational” than its predecessors, featuring a variety of personality settings like ‘Nerdy’, ‘Candid’, and ‘Quirky’.
Shortly after its release, ChatGPT users and researchers began noticing a pattern of repeated mentions of goblins, gremlins and other fantasy creatures.
“Starting with GPT-5.1, our models began developing a strange habit: they increasingly mentioned goblins, gremlins, and other creatures in their metaphors,” OpenAI notes in a blog post about the issue.
“We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread.”
Safety researchers at the company reported a 175 per cent increase in mentions of the word ‘goblin’ following the release of GPT-5.1 as a result of the model being incentivised to use playful metaphors.
The training method was not corrected for subsequent models, and when GPT-5.4 launched in March, use of ‘goblin’ had risen nearly 4,000 per cent in the Nerdy personality setting, with mentions increasing by a similar relative proportion across other models.
“The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them,” OpenAI noted.
“Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.”
The glitch was relatively harmless in this instance, but it points to a broader flaw in the way leading artificial intelligence models are trained and developed.
Reinforcement learning and the use of reward signals can cause an AI model’s behaviour to drift in unexpected and unintended ways.
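The dynamic OpenAI describes can be sketched in a toy simulation. This is not OpenAI’s actual training code; it is a minimal, hypothetical illustration of how a small accidental reward bias, compounded over many reinforcement-learning updates, can push a rare stylistic quirk towards dominance. The reward values, learning rate and update rule are all assumptions chosen for clarity.

```python
import random

def train(rounds=10_000, lr=0.01, seed=0):
    """Toy REINFORCE-style loop: replies containing a creature metaphor
    accidentally earn slightly more reward, so the policy's probability
    of producing one drifts upward over training."""
    rng = random.Random(seed)
    p_creature = 0.05  # initial chance a reply uses a creature metaphor
    for _ in range(rounds):
        uses_creature = rng.random() < p_creature
        reward = 1.2 if uses_creature else 1.0  # small accidental bias
        baseline = 1.0
        advantage = reward - baseline
        # Nudge the probability towards actions that beat the baseline
        if uses_creature:
            p_creature += lr * advantage * (1 - p_creature)
        else:
            p_creature -= lr * advantage * p_creature
        p_creature = min(max(p_creature, 0.001), 0.999)
    return p_creature

print(f"creature-metaphor rate after training: {train():.3f}")
```

Run repeatedly with different seeds, the rate reliably climbs far above its 5 per cent starting point: because the bias only ever rewards the quirk, there is no pressure pulling it back down, which mirrors why the goblin habit spread rather than faded.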
OpenAI said its research and safety team has built new ways to investigate rogue patterns and will be conducting more audits of model behaviour in the future.