Anthropic Says DeepSeek Used 24,000 Fake Accounts to Steal Claude. What Is a Distillation Attack?

Abhishek Gautam · 9 min read

Quick summary

Anthropic publicly accused DeepSeek, Moonshot AI, and MiniMax of running industrial-scale distillation attacks on Claude — 24,000 fraudulent accounts, 16 million exchanges, and extracted AI capabilities being fed into Chinese military and surveillance systems. Here is what actually happened and what it means.

Anthropic posted something on X that the AI industry had been whispering about for months but that nobody had said publicly. DeepSeek, Moonshot AI, and MiniMax — three Chinese AI labs — ran what Anthropic called industrial-scale distillation attacks on Claude. The numbers in the announcement were specific: over 24,000 fraudulent accounts, more than 16 million exchanges with Claude, all designed to extract the model's capabilities and feed them into competing systems.

This is one of the most significant public accusations in the history of commercial AI, and most of the coverage has explained it badly. Here is what actually happened, what a distillation attack is, why it is different from normal API use, and what it means for everyone building on top of AI models.

What Is Model Distillation?

Model distillation is a legitimate, widely used technique in AI. The idea is conceptually simple. You have a large, expensive model called the teacher. You use the teacher to generate outputs — answers, reasoning, code, text — and then you train a smaller, cheaper model called the student to reproduce those outputs. The student learns to mimic the teacher.
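In code, the core of this teacher-student setup is just a loss function. Here is a minimal, illustrative sketch in plain NumPy with toy logits — not any lab's actual pipeline: the teacher's output logits are softened with a temperature, and the student is penalized (via KL divergence) for diverging from that distribution.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T flattens the distribution."""
    z = logits / temperature
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    Minimizing this teaches the student to mimic the teacher's full
    output distribution, not just its single top answer.
    """
    p = softmax(teacher_logits, temperature)   # teacher "soft labels"
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))

teacher      = np.array([4.0, 1.0, 0.5])   # confident teacher
good_student = np.array([3.8, 1.1, 0.4])   # already close to teacher
bad_student  = np.array([0.2, 3.0, 2.5])   # disagrees with teacher

print(distillation_loss(teacher, good_student))  # small
print(distillation_loss(teacher, bad_student))   # much larger
```

In a real pipeline this loss is backpropagated through the student over millions of teacher outputs; the sketch only shows what is being minimized.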

Done legitimately, distillation is how most AI labs create smaller versions of their models. OpenAI has distilled GPT-4 into smaller models. Google distills Gemini into lighter versions. Meta's Llama models have been distilled into smaller variants. This is normal, open practice. When Anthropic trains a smaller Claude to behave like a larger Claude, that is distillation.

The key is that you have the right to do it. You own the teacher model. You decide what gets distilled.

What Makes a Distillation Attack Different

A distillation attack is the same technical process with one critical difference: you are doing it to someone else's model without permission, at scale, in a way that deliberately evades detection.

Here is how it works in practice. You create thousands of accounts on the target service — in this case, Claude. You automate queries across those accounts to generate massive volumes of model outputs. You store those outputs. Then you use those outputs as training data to teach your own model to replicate the behavior of the model you were querying.
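Mechanically, the collection side looks like ordinary dataset building. A deliberately simplified sketch, where the account pool and `query_fn` are hypothetical stand-ins rather than real infrastructure:

```python
import json

def collect_exchanges(accounts, prompts, query_fn):
    """Rotate queries across an account pool and record every exchange.

    `query_fn(account, prompt)` is a hypothetical stand-in for an API
    call; the result is a list of prompt/response pairs that can later
    be used as training data for a student model.
    """
    dataset = []
    for i, prompt in enumerate(prompts):
        account = accounts[i % len(accounts)]   # rotate accounts to spread load
        response = query_fn(account, prompt)
        dataset.append({"prompt": prompt, "response": response})
    return dataset

# Stub query function for illustration only.
fake_query = lambda account, prompt: f"answer to: {prompt}"

data = collect_exchanges(["acct-1", "acct-2"], ["q1", "q2", "q3"], fake_query)
print(json.dumps(data[0]))
```

At 16 million exchanges, the only differences from this toy are scale, automation of prompt generation, and the effort spent hiding the traffic from the provider.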

The result is that you have essentially transferred a significant portion of the target model's capabilities into your own system. You have done this without paying for the research, compute, and data that went into building the original model. And crucially, Anthropic alleges that in the Chinese labs' case, the safety guardrails were stripped out in the process.

A footprint of 24,000 accounts and 16 million exchanges is not exploratory research. That is a coordinated, automated infrastructure project designed specifically to extract Claude's capabilities at scale. The accounts were fraudulent — created to evade rate limits, terms of service enforcement, and detection systems that would catch normal API abuse.

Why Anthropic Is Treating This as a National Security Issue

The technical theft is one thing. Anthropic's framing goes further, and it is worth taking seriously.

They specifically said that foreign labs illicitly distilling American models can remove safeguards, feeding model capabilities into their own military, intelligence, and surveillance systems. This is not a standard corporate complaint about IP theft. It is an argument that the capabilities being extracted are dual-use technologies with direct national security implications.

Claude has safety training that is supposed to prevent it from helping with certain categories of requests — weapons development, mass surveillance systems, disinformation at scale. A distilled model that reproduces Claude's general capabilities but has had those safety layers removed or weakened would be a genuinely dangerous tool in the hands of state actors with access to military and intelligence infrastructure.

This is what makes the accusation qualitatively different from a competitor copying features. It is Anthropic arguing that the theft is not just commercial but potentially contributing to capabilities that could be used against civilian populations or adversaries.

How Did They Get Caught?

Anthropic did not explain in detail how they detected the attack, and they likely will not, because doing so would teach future attackers what to avoid.

But the general detection approaches for this kind of abuse are known. At the scale described — 24,000 accounts and 16 million exchanges — the pattern of queries would look nothing like normal user behavior. Legitimate users ask diverse, unpredictable questions. Automated extraction systems generate queries that are more structured, more systematic, and more comprehensive in their coverage of a model's capability space. Query timing, account registration patterns, shared infrastructure signatures, and the statistical properties of the prompts themselves all create signals that abuse detection systems can flag.
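A toy version of two such signals: extraction pipelines tend toward metronomic query timing and templated prompts, while humans are irregular and lexically diverse. This scoring heuristic is hypothetical and far cruder than anything a production abuse-detection system would run:

```python
import statistics

def suspicion_score(timestamps, prompts):
    """Score how machine-generated an account's traffic looks (0 to 1).

    Two crude signals: near-constant gaps between queries (automation)
    and low lexical diversity across prompts (templated generation).
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    # Coefficient of variation near 0 means suspiciously regular timing.
    cv = statistics.pstdev(gaps) / (statistics.mean(gaps) or 1)
    timing_regularity = 1.0 - min(1.0, cv)
    vocab = {w for p in prompts for w in p.split()}
    total_words = sum(len(p.split()) for p in prompts)
    templatedness = 1.0 - len(vocab) / total_words   # repeated wording scores high
    return 0.5 * timing_regularity + 0.5 * templatedness

bot_times = [0, 10, 20, 30, 40]                     # perfectly spaced
bot_prompts = ["explain topic 1", "explain topic 2", "explain topic 3",
               "explain topic 4", "explain topic 5"]

human_times = [0, 7, 45, 46, 300]                   # bursty, irregular
human_prompts = ["why is the sky blue", "fix my python bug",
                 "draft an email to my landlord", "best pizza dough ratio",
                 "summarise this contract"]

print(suspicion_score(bot_times, bot_prompts))      # high
print(suspicion_score(human_times, human_prompts))  # low
```

Real systems add many more features — registration metadata, shared IP and payment infrastructure, embedding-space coverage of the prompt stream — but the principle is the same: extraction at scale has a statistical shape.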

The fact that they needed 24,000 accounts rather than a handful suggests Anthropic's detection systems were catching and banning them, and the attackers were cycling through new accounts to stay ahead of enforcement. That is an arms race, and it is presumably ongoing.

What This Means for Developers Using AI APIs

If you are building products on top of Claude, GPT-4, Gemini, or any commercial AI API, this incident has direct implications for you.

Every commercial AI provider now has strong incentive to monitor API usage more aggressively for distillation-like patterns. That means rate limits are likely to tighten. Terms of service around automated bulk usage will become stricter. Enterprise agreements will include more specific prohibitions on using model outputs to train competing models.
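Tighter rate limits usually mean something like a token bucket per account: a small burst allowance that refills at a fixed sustained rate. A minimal sketch of the idea — illustrative only, not any provider's actual implementation:

```python
class TokenBucket:
    """Classic token-bucket rate limiter: allows a burst of `capacity`
    requests, refilled at `rate` tokens per second thereafter."""

    def __init__(self, capacity, rate):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, now):
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(capacity=3, rate=1.0)   # 3-request burst, 1 req/s sustained
print([bucket.allow(t) for t in [0, 0.1, 0.2, 0.3, 0.4]])
# → [True, True, True, False, False]
```

Cycling through thousands of accounts is precisely a way to multiply this per-account budget, which is why account-creation fraud and rate limiting are two halves of the same enforcement problem.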

The terms-of-service restrictions are not hypothetical. OpenAI's terms of service already explicitly prohibit using ChatGPT outputs to train models that compete with OpenAI. Anthropic's terms have similar language. The DeepSeek incident will accelerate enforcement and make violations more consequential.

For legitimate developers, the practical impact is mostly invisible — you are not running 24,000 accounts and 16 million queries. But if you are building applications that store and reuse large volumes of AI outputs, or if you are fine-tuning models on Claude-generated data, you should read the terms carefully. The line between legitimate use and distillation is a legal and contractual one, and it is likely to be drawn more sharply in the near future.

What Happens Next

Anthropic called for rapid, coordinated action among industry players, policymakers, and the broader AI community. That language suggests they are not planning to handle this purely through their own enforcement mechanisms. They want regulatory and industry-wide responses.

The most likely near-term outcomes are:

Better industry coordination on threat intelligence. If DeepSeek was hitting Claude, they were probably hitting GPT-4 and Gemini too. Labs sharing information about attack patterns would make detection faster across the industry.

Government involvement. The national security framing in Anthropic's announcement is not accidental. They are inviting the US government to treat AI model theft as a national security issue alongside chip export controls and data sovereignty regulations. Whether legislation follows is uncertain, but the policy conversation has moved significantly.

Technical countermeasures. Some AI labs are already experimenting with techniques that make distillation harder. Watermarking model outputs, adding subtle statistical signatures to responses, and adversarial perturbations that survive distillation and degrade the student model's quality are all active research areas. Expect more investment here.
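One published line of work here is the "green list" watermark (Kirchenbauer et al., 2023): a pseudorandom partition of the vocabulary is seeded from the previous token plus a secret key, generation is biased toward the green half, and a detector holding the key counts how many tokens landed in their predecessor's green list. A heavily simplified sketch, nowhere near deployable:

```python
import hashlib

VOCAB = [f"tok{i}" for i in range(100)]   # toy vocabulary

def green_list(prev_token, secret="key"):
    """Pseudorandomly split the vocabulary in half, seeded by the
    previous token and a secret key; return the favored 'green' half."""
    greens = set()
    for tok in VOCAB:
        h = hashlib.sha256(f"{secret}:{prev_token}:{tok}".encode()).digest()
        if h[0] % 2 == 0:
            greens.add(tok)
    return greens

def detect(tokens, secret="key"):
    """Fraction of tokens in the green list of their predecessor.
    Watermarked text scores near 1.0; unmarked text near 0.5."""
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
               if tok in green_list(prev, secret))
    return hits / (len(tokens) - 1)

# A watermarking generator always (in this toy, deterministically)
# picks a green token:
marked = ["tok0"]
for _ in range(20):
    marked.append(sorted(green_list(marked[-1]))[0])
print(detect(marked))   # 1.0 by construction
```

The open research question relevant to this story is whether such signatures survive distillation — whether a student trained on watermarked outputs inherits a detectable trace — and that is exactly where the investment is going.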

The companies named — DeepSeek, Moonshot AI, MiniMax — are unlikely to face direct legal consequences in China. But the accusation affects their international credibility, their ability to partner with Western companies, and potentially their access to Western cloud infrastructure.

The Bigger Picture

The DeepSeek distillation attack sits inside a broader contest that Jensen Huang described as a five-layer competition between the US and China. At the model layer, Anthropic said American frontier models are ahead. The distillation attacks are an attempt to close that gap without doing the underlying research.

This is how technology competitions actually work at the frontier. When you cannot replicate the underlying capability, you try to steal the outcome. The semiconductor industry saw this pattern for decades. The AI model layer is now experiencing the same dynamic, and the defenses being developed now will shape how open or closed the AI ecosystem becomes over the next several years.

For anyone following AI closely, this announcement marks a shift. The era of AI labs sharing capabilities openly with all comers, regardless of who they are or what they do with the outputs, is ending. The commercial and security pressures are now large enough that some degree of restriction was probably inevitable. Anthropic naming names in public is the signal that the industry has decided the threat is real enough to confront directly.

