AI Chatbots Can Be Tricked Into Bypassing Safety Features by Using Poetry, New Study Reveals – Tech Times

Few people speak or write in verse in everyday conversation, but a new study has found that phrasing requests as poetry can bypass an AI chatbot's safety guardrails.
Researchers at Icaro Labs identified an exploit that relies on poetry to slip past a generative AI chatbot's safety guardrails and elicit responses on prohibited content or topics.
The study, titled “Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models,” details how the researchers discovered the technique and tested whether it could indeed bypass the models' safeguards.
The researchers reformulated ordinary, conversational prompts as verse and found that this achieved a high success rate in tricking chatbots into producing prohibited responses.
According to the researchers, their tests showed that the poetic form “operates as a general-purpose jailbreak operator” across the chatbots they evaluated.
The team did not publish the poem-style prompts they used in the jailbreak attempts, either in the paper or elsewhere.
According to the team (via Wired), the prompts are too dangerous to share publicly, particularly because they were used to extract information on highly sensitive subjects.
According to Icaro Labs, the prompts elicited responses on topics such as the steps and materials needed to build a nuclear weapon, child sexual abuse material (CSAM), and self-harm.
The team tested the poetic prompts on popular chatbots, including OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude, among others.
Based on the findings, models such as Google's Gemini, DeepSeek, and Mistral AI were the most likely to consistently answer when the poetic exploit was used to request prohibited content. The researchers did not reveal the specific answers these chatbots gave.
By contrast, OpenAI's ChatGPT running GPT-5 and Anthropic's Claude Haiku 4.5 performed best, with the researchers reporting that they were the least likely to be bypassed by the poetic prompts.