Anthropic Accuses Alibaba of Illicitly Accessing Claude Models in Distillation Attack

Related

Anthropic Takes Latest AI Models Offline to Comply With Export Controls

What happened Anthropic said it has taken its latest artificial...

Anthropic Releases Claude Fable 5 With Guardrails for Cybersecurity and Biology

What happened Anthropic is making Claude Fable 5 available to...

Attackers Abuse Google Ads and Claude.ai Shared Chats to Push Mac Malware

What happened An active malvertising campaign is abusing Google sponsored...

NSA Confirms Use of Anthropic’s Mythos Despite Pentagon Blacklist

What happened The NSA is actively deploying Anthropic's Mythos Preview,...

Share

What happened

Anthropic accused Alibaba of orchestrating a large unauthorized extraction campaign targeting its Claude AI models.

In a June 10, 2026 letter to U.S. Senate Banking Committee Chair Tim Scott and Ranking Member Elizabeth Warren, Anthropic alleged that operators affiliated with Alibaba and its AI research division, Alibaba Qwen, conducted a coordinated campaign to harvest capabilities from Claude.

The campaign allegedly ran from April 22 to June 5, 2026 and generated more than 28.8 million exchanges with Claude through nearly 25,000 fraudulent accounts.

Anthropic said the operation targeted Claude’s most advanced and commercially valuable capabilities, including software engineering and agentic reasoning. The company described those capabilities as central to its Mythos Preview model.

The alleged attack relied on adversarial distillation, where a less capable AI model is trained on outputs from a more powerful model to mimic its capabilities at lower development cost.

Anthropic warned that this type of activity could allow Chinese AI labs to replicate frontier U.S. AI capabilities without paying the same research, development, and compute costs required to train advanced models from scratch.

The company also warned that AI systems built through adversarial distillation may lack safety guardrails, creating risks beyond intellectual property theft.

Anthropic said the alleged Alibaba-linked activity is part of a broader pattern. In February 2026, the company disclosed a separate scheme involving DeepSeek and two other Chinese AI labs attempting to illicitly access Claude’s platform.

The disclosure is already drawing policy attention in Washington. Senators Bill Hagerty and Andy Kim are reportedly moving to introduce an amendment to defense legislation that would blacklist or sanction Chinese firms found improperly accessing U.S. AI model outputs to train competing systems.

Alibaba has not responded to requests for comment.

Who is affected

Anthropic is directly affected because the company alleges its Claude models were targeted in a large-scale capability-harvesting campaign.

The broader AI sector is also affected, especially frontier model developers whose commercial value depends on expensive research, model training, safety tuning, and proprietary capabilities.

U.S. policymakers, enterprises using AI platforms, and security leaders evaluating AI supply chain risk are also affected because the incident raises questions about model access abuse, output harvesting, account fraud, and cross-border AI competition.

Organizations using AI models for software engineering, agentic workflows, or sensitive enterprise tasks should pay attention because the alleged campaign targeted exactly those high-value capabilities.

Why CISOs should care

This case shows that AI model security is becoming a direct enterprise and national security concern. The alleged attack did not rely on stealing source code or breaching a traditional network. It reportedly used fraudulent accounts and massive interaction volume to extract model behavior through normal-looking access.

For CISOs, that shifts the security problem from only protecting infrastructure to also protecting model capabilities. AI providers and enterprise AI teams need controls that detect abuse patterns, synthetic account creation, abnormal usage volume, automated querying, and attempts to systematically reproduce model outputs.

The incident also highlights the risk of adversarial distillation. If attackers can use model outputs to train competing or less-guarded systems, the damage may include intellectual property loss, weakened safety controls, and wider availability of advanced AI capabilities outside the original provider’s governance framework.

The alleged focus on software engineering and agentic reasoning is especially relevant for security leaders. These are the same capabilities organizations are beginning to integrate into development, automation, and operational workflows, making abuse detection and model governance more important.

3 practical actions

  1. Monitor AI platforms for industrial-scale output harvesting: Anthropic alleged that the campaign generated more than 28.8 million exchanges through nearly 25,000 fraudulent accounts. CISOs should track account creation patterns, usage spikes, repeated prompt structures, and coordinated querying that may indicate model extraction.
  2. Treat model capabilities as protected intellectual property: The alleged campaign targeted software engineering and agentic reasoning capabilities. Organizations building or fine-tuning AI systems should apply access controls, usage policies, telemetry, rate limits, and abuse detection around high-value model functions.
  3. Assess AI vendors for anti-abuse and distillation defenses: Enterprises using external AI platforms should ask how vendors detect fraudulent accounts, automated querying, adversarial distillation, and large-scale scraping of model outputs. Vendor reviews should include model security, safety controls, and incident response processes for AI abuse.
IMG 0514 2
+ posts

John Kevin Hao is a news and feature writer covering cybersecurity, technology, and business targeted for professional audiences.