Researchers Find ChatGPT Vulnerabilities That Let Attackers Trick AI Into Leaking Data - The Hacker News

Cybersecurity researchers have disclosed a new set of vulnerabilities impacting OpenAI’s ChatGPT artificial intelligence (AI) chatbot that could be exploited by an attacker to steal personal information from users’ memories and chat histories without their knowledge.
The seven vulnerabilities and attack techniques, according to Tenable, were found in OpenAI’s GPT-4o and GPT-5 models. OpenAI has since addressed some of them.
These issues expose the AI system to indirect prompt injection attacks, allowing an attacker to manipulate the expected behavior of a large language model (LLM) and trick it into performing unintended or malicious actions, security researchers Moshe Bernstein and Liv Matan said in a report shared with The Hacker News.
The identified shortcomings are listed below –
The disclosure comes close on the heels of research demonstrating various kinds of prompt injection attacks against AI tools that are capable of bypassing safety and security guardrails –
The findings show that exposing AI chatbots to external tools and systems, a key requirement for building AI agents, expands the attack surface by presenting more avenues for threat actors to conceal malicious prompts that end up being parsed by models.
“Prompt injection is a known issue with the way that LLMs work, and, unfortunately, it will probably not be fixed systematically in the near future,” Tenable researchers said. “AI vendors should take care to ensure that all of their safety mechanisms (such as url_safe) are working properly to limit the potential damage caused by prompt injection.”
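As a rough illustration of the kind of link-safety check the researchers allude to, the sketch below tests a URL against a fixed allowlist before it would be rendered or fetched. It is not OpenAI's url_safe implementation, whose internals are not described here; the function name `is_url_allowed` and the hosts in `ALLOWED_HOSTS` are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; not any vendor's real configuration.
ALLOWED_HOSTS = {"example.com", "docs.example.com"}


def is_url_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host is exactly on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example, a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https":
        return False
    # hostname drops any userinfo and port, and is already lowercased.
    host = (parts.hostname or "").rstrip(".")
    return host in allowed_hosts
```

Matching the parsed hostname exactly, rather than searching the raw string for a trusted name, is what rejects lookalikes such as `https://example.com.evil.test/` or `https://example.com@evil.test/`. A check like this only limits where data can be sent; it does not stop the injected prompt itself from being parsed.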
The development comes as a group of academics from Texas A&M, the University of Texas, and Purdue University found that training AI models on “junk data” can lead to LLM “brain rot,” warning “heavily relying on Internet data leads LLM pre-training to the trap of content contamination.”
Last month, a study from Anthropic, the U.K. AI Security Institute, and the Alan Turing Institute also discovered that it’s possible to successfully backdoor AI models of different sizes (600M, 2B, 7B, and 13B parameters) using just 250 poisoned documents, upending previous assumptions that attackers needed to obtain control of a certain percentage of training data in order to tamper with a model’s behavior.
From an attack standpoint, malicious actors could attempt to poison web content that’s scraped for training LLMs, or they could create and distribute their own poisoned versions of open-source models.
“If attackers only need to inject a fixed, small number of documents rather than a percentage of training data, poisoning attacks may be more feasible than previously believed,” Anthropic said. “Creating 250 malicious documents is trivial compared to creating millions, making this vulnerability far more accessible to potential attackers.”
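To make the "fixed number versus percentage" point concrete, the short sketch below computes what share of a training corpus 250 documents would represent. Only the figure of 250 comes from the study; the corpus sizes are hypothetical round numbers picked for illustration, not figures from the research.

```python
POISONED_DOCS = 250  # count reported in the study

# Hypothetical corpus sizes, in documents, for illustration only.
CORPUS_SIZES = [1_000_000, 100_000_000, 10_000_000_000]


def poisoned_fraction(poisoned: int, corpus: int) -> float:
    """Share of the corpus made up of poisoned documents."""
    return poisoned / corpus


for size in CORPUS_SIZES:
    share = poisoned_fraction(POISONED_DOCS, size)
    print(f"{size:>14,} documents -> {share:.6%} poisoned")
```

Under these assumed sizes, the poisoned share shrinks by orders of magnitude as the corpus grows while the attacker's effort stays constant, which is why a fixed-count attack is easier to mount than one that must control a set percentage of the data.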
And that’s not all. Another study by Stanford University scientists found that optimizing LLMs for competitive success in sales, elections, and social media can inadvertently drive misalignment, a phenomenon referred to as Moloch’s Bargain.
“In line with market incentives, this procedure produces agents achieving higher sales, larger voter shares, and greater engagement,” researchers Batu El and James Zou wrote in an accompanying paper published last month.
“However, the same procedure also introduces critical safety concerns, such as deceptive product representation in sales pitches and fabricated information in social media posts, as a byproduct. Consequently, when left unchecked, market competition risks turning into a race to the bottom: the agent improves performance at the expense of safety.”
