McKinsey’s AI chatbot hack reveals the security risks agentic AI poses – CXOToday.com

Welcome to the forefront of conversational AI as we explore the fascinating world of AI chatbots in our dedicated blog series. Discover the latest advancements, applications, and strategies that propel the evolution of chatbot technology. From enhancing customer interactions to streamlining business processes, these articles delve into the innovative ways artificial intelligence is shaping the landscape of automated conversational agents. Whether you’re a business owner, developer, or simply intrigued by the future of interactive technology, join us on this journey to unravel the transformative power and endless possibilities of AI chatbots.



Agentic AI is turning out to be the real force multiplier for threat actors by allowing them to automate 80-90% of a cyberattack with minimal human intervention. Last November, Anthropic reported a sophisticated espionage campaign where a China-backed threat actor group manipulated Claude Code’s agentic capabilities to target close to thirty global organizations. 
This week, security researchers at CodeWall used AI agents to break into McKinsey & Company’s internal generative AI chatbot Lilli. Released in August 2023, Lilli served as a centralized intelligence hub that can quickly search and summarize decades of research, allowing analysts and partners to deliver quick advice to clients. According to McKinsey, Lilli processes 500,000 prompts every month and is used  by 72% of the firm’s employees. 
Researchers at CodeWall claim that they were able to break into Lilli and gain full read and write access in just two hours using an offensive AI agent with no insider knowledge or humans in the loop. They claim to have access to 46.5 million chat messages that include 728,000 files, and 57,000 user accounts. 
CodeWall assured that the attack wasn’t carried out with malicious intention, but to demonstrate how threat actors are increasingly using AI agents in real world attacks. The researchers claim that McKinsey was suggested as a target by its autonomous agent due to their public responsible disclosure policy and recent updates to their Lilli platform. 
“In the AI era, the threat landscape is shifting drastically — AI agents autonomously selecting and attacking targets will become the new normal,” researchers said in a blog post, published March 10.
According to the researchers, the AI agent scanned every end point McKinsey’s system interacted with and found 200 entry points, most of which were locked except 22 that were left open.  
One of the open endpoints had a flaw. It trusted the user’s input (JSON keys) and treated them as part of the database’s internal code (SQL). This allowed the agent to send instructions to the database. Instead of getting access denied messages, the system sent back error messages that repeated the AI’s messages. The agent then performed a brute force attack over 15 iterations using system errors to map the database’s structure, allowing it to pull data including user interactions with Lilli. 
The SQL injection flaw in Lilli was found in February, and a responsible disclosure email was sent to McKinsey’s security team with a high-level impact summary on March 1. Within a day, McKinsey claims to have successfully patched all unauthenticated endpoints and blocked public API documentation. 
Further, CodeWall found that the SQL injection wasn’t read only, which means that an attacker with write access could have rewritten Lilli’s prompt through the same injection attack. This could have allowed attackers to poison the AI chatbot’s advice, instruct it to embed confidential information in its responses, and also remove its guardrails. 
What is worrying is that the security team of a leading global consultancy firm with billions in annual revenue failed to detect a routine SQL injection vulnerability in an AI chatbot that was operational for over two years. Imagine the risk firms with limited resources are facing. 
CodeWall’s use of an AI agent to identify a zero-day vulnerability demonstrates how enterprises can leverage it to secure their systems before threat actors can exploit them. Anthropic claims that the very abilities that make models like Claude so relevant for threat actors also makes it a highly reliable tool for cybersecurity. 
AI-driven attacks can also be mitigated by hardening the guardrails around frontier AI models to prevent jailbreaking attempts and their misuse for launching machine speed attacks. According to Anthropic, the Chinese threat actor group tricked Claude to bypass its guardrails and broke down their attacks into small, seemingly innocent tasks that Claude didn’t suspect as malicious. They also told Claude that they were using it for defensive testing for a cybersecurity company. 
The attackers then used Claude Code for reconnaissance on target organizations systems and were able to quickly identify and test security vulnerabilities. It was then used to identify and extract high- value data  with minimal human supervision. 
According to Gartner, by 2027 more than 40% of AI-related data breaches worldwide will involve malicious use of GenAI. 
Palo Alto Networks’ Unit 42 team believes that attackers will increasingly use agentic AI to build agents with expertise in specific attack stages. When orchestrated together, these agents can autonomously find vulnerabilities, execute attacks,  and adjust tactics in real time. Unit 42 warned that agentic AI will give rise to a new class of adversaries that can carry out end-to-end cyberattacks with minimal human supervision.  
IBM’s Cost of Data Breach report 2025, shows that organizations that are using AI and automation in cybersecurity have reduced their breach times by 80 days and saved $1.9 million in average breach costs as compared to organizations that are not using AI for security. 
CXOtoday is a premier resource on the world of IT, relevant to key business decision makers. We offer IT perspective & news to the C-suite audience. We also provide business and technology news to those who evaluate, invest, and manage the IT infrastructure of organizations. CXOtoday has a well-networked and strong community that encourages discussions on what’s happening in the world of IT and its impact on businesses.
Copyright © 2025 Trivone. All Rights Reserved.
We use cookies to improve your experience on our site. By using our site, you consent to cookies.
Websites store cookies to enhance functionality and personalise your experience. You can manage your preferences, but blocking some cookies may impact site performance and services.
Essential cookies enable basic functions and are necessary for the proper function of the website.
You can find more information in our Privacy Policy and Privacy Policy.

source

Scroll to Top