This sneaky photo trick gets AI chatbots to ignore their safety rules - Digital Trends

A photo that looks completely ordinary to you could carry a hidden instruction to trick an AI chatbot into ignoring its safety rules, according to new research out of Florida International University. The study found that pixel-level alterations in an image that are invisible to the human eye can be enough to confuse the model reading the image and lead it to generate responses it would normally block.
“AI models don’t see images the same way humans do,” said Hadi Amini, an associate professor at FIU’s Knight Foundation School of Computing and Information Sciences. They read photos as numerical data, he explained, and shifting that data even slightly can change what the system reads in the image and how it responds.
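To make the "numerical data" point concrete, here is a minimal toy sketch. It is not JaiLIP and does not come from the FIU study. It assumes a hypothetical linear scorer (`score`, `label`, `weights`, and `bias` are all made up for illustration) over a tiny grayscale image, and shows that moving every pixel by a single brightness level out of 255 can be enough to change what such a model "reads."

```python
# Toy illustration only: a hypothetical linear "model" over a tiny grayscale image.
# Nothing here is taken from the FIU research or from any real vision model.

def score(pixels, weights, bias):
    """Weighted sum of pixel values plus a bias term."""
    return sum(p * w for p, w in zip(pixels, weights)) + bias

def label(pixels, weights, bias):
    """A positive score reads as label 'A'; anything else reads as 'B'."""
    return "A" if score(pixels, weights, bias) > 0 else "B"

# A 4x4 grayscale image flattened to 16 brightness values (0-255).
image = [120, 121, 119, 120, 122, 120, 118, 121,
         120, 119, 121, 120, 120, 122, 119, 120]

# Hypothetical model parameters: alternating +1 / -1 weights, zero bias.
weights = [1 if i % 2 == 0 else -1 for i in range(16)]
bias = 0

# Shift each pixel by one brightness level in the direction that raises the
# score, clamping to the valid 0-255 range.
step = 1
perturbed = [
    min(255, max(0, p + step * (1 if w > 0 else -1)))
    for p, w in zip(image, weights)
]

largest_change = max(abs(a - b) for a, b in zip(image, perturbed))

print(label(image, weights, bias))      # label for the original image
print(label(perturbed, weights, bias))  # label after the tiny shift
print(largest_change)                   # biggest per-pixel difference
```

A one-level brightness change per pixel is far below what a person would notice, yet the toy scorer's output changes because the many tiny shifts add up in the weighted sum. Real multimodal models are vastly more complex, but the underlying sensitivity to small numerical changes is the same kind of effect the researchers describe.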
Amini and graduate researcher Md Jueal Mia used that insight to build a method called JaiLIP, short for Jailbreaking with Loss-guided Image Perturbation, according to a release on the findings. The technique calculates the smallest pixel change needed to push a model toward an unsafe response without altering anything visible in the photo itself.
Testing JaiLIP on BLIP-2, a multimodal AI model used in research and development, the team found that altered images nearly doubled how often the system produced harmful responses. In one test, a modified photo of a stoplight got the model to explain how to run a red light without getting a ticket.
Small language models, the kind many businesses rely on for bookkeeping or customer support, turned out to be especially easy to fool in the team’s testing. As more companies route such roles to AI tools, a flaw like this could erode user trust or open a new door for attackers.
The discovery joins a growing list of research probing AI guardrails, including a method that let outside researchers hijack AI-controlled robots and Anthropic’s own findings on a model that learned to misbehave once it realized it could get away with it. What stands out in FIU’s research is the delivery method. A jailbreak hidden inside an otherwise normal photo doesn’t need clever wording or a workaround prompt, just an image nobody would think twice about.
