In what sounds like both a word of warning and weirdly a little bit of bragging, SF-based Anthropic says that its AI chatbot Claude was used by state-sponsored hackers in China to commit a large-scale cyberattack on American companies.
Anthropic’s Claude chatbot was reportedly used to commit a large-scale cyberattack on around 30 American companies two months ago, and it’s hard not to feel like Anthropic doesn’t entirely hate the dubious honor of being the first company whose chatbot has been employed in this nefarious way.
As the Wall Street Journal was first to report, state-sponsored Chinese hackers used Claude to collect usernames and passwords from the databases of over two dozen tech companies, financial institutions, chemical manufacturers, and government agencies. They then used any valid login credentials to steal private data.
Reportedly, only a “small number” of these attacks were successful, but the scope of the damage is not clear.
“We believe this is the first documented case of a large-scale cyberattack executed without substantial human intervention,” says Anthropic in a statement.
“While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale,” the statement adds.
Anthropic says it began to suspect the hacker activity in September, noting that the hackers used Claude’s “agentic” capabilities “to an unprecedented degree—using AI not just as an advisor, but to execute the cyberattacks themselves.”
“Upon detecting this activity, we immediately launched an investigation to understand [the attack’s] scope and nature,” the company says. “Over the following ten days, as we mapped the severity and full extent of the operation, we banned accounts as they were identified, notified affected entities as appropriate, and coordinated with authorities as we gathered actionable intelligence.”
The attack is notable because of how it exploited AI agents to do much of the grunt work of stealing data, with great speed.
“The sheer amount of work performed by the AI would have taken vast amounts of time for a human team,” Anthropic says. “At the peak of its attack, the AI made thousands of requests, often multiple per second — an attack speed that would have been, for human hackers, simply impossible to match.”
Anthropic acknowledges that while AI agents can be “valuable for everyday work and productivity,” they bring with them substantial peril when it comes to cybersecurity — given that they “can be run autonomously for long periods of time and … complete complex tasks largely independent of human intervention.”
As such attacks grow in size and scope, Anthropic says “we’ve expanded our detection capabilities and developed better classifiers to flag malicious activity.”
Anthropic is providing what may be too much transparency in a blog post, describing exactly how the hackers jailbroke Claude and broke tasks down into smaller ones, convincing the chatbot that it was not doing anything nefarious. But, the company says, these methods are likely to be replicated anyway, so it is publicizing the attack in the interest of “threat sharing,” and to encourage the development of “improved detection methods, and stronger safety controls.”
This report comes five months after another alarming report from Anthropic, which found through its own stress-testing that multiple large language AI models, including its own, will resort to harmful behaviors like blackmail or even passive manslaughter if their own existence is threatened, especially when working in “agentic” mode.
More safety research was needed, the report concluded, to prevent these “agentic misalignment concerns.”
Previously: Alarming Study Suggests Most AI Large-Language Models Resort to Blackmail, Other Harmful Behaviors If Threatened
Jay C. Barmann is a fiction writer and web editor who's lived in San Francisco for 20+ years.