Claude Code Found 500 Security Bugs That Experts Missed for Decades. Moravec's Paradox Explains Why AI Cracked Cybersecurity First.
Quick summary
Anthropic's Claude Code can scan an entire codebase and find security vulnerabilities the way a skilled hacker would — and it already caught 500 real bugs in open source projects that human experts had missed for years. The reason this happened before AI learned to fold laundry is Moravec's Paradox, and it tells us something important about which jobs are actually safe.
Cybersecurity was supposed to be the last frontier. The argument went like this: writing secure code requires deep contextual understanding, adversarial thinking, the ability to reason about what an attacker would do, and years of hard-won intuition about how systems fail under pressure. It requires a kind of creativity that machines, which are essentially very sophisticated pattern matchers, could not replicate. Security researchers said this. Venture capitalists said this. Hiring managers said this.
Then Anthropic's Claude Code scanned a set of open source projects and found 500 real security vulnerabilities that human experts had missed. Some of those bugs had been sitting in production code for years. A few had been sitting there for over a decade.
The frontier fell faster than anyone expected. And there is a specific intellectual framework from 1988 that explains exactly why.
What Claude Code's Security Scanner Actually Does
Claude Code is Anthropic's AI-powered development tool. The security scanning capability works differently from traditional static analysis tools like linters or SAST scanners.
Traditional security scanners look for known patterns. They have rule sets: "if a user input reaches a database query without sanitization, flag it." They are essentially matching against a library of known vulnerability signatures. They are fast, cheap, and they miss anything that does not match their rules. Novel attack vectors, context-dependent vulnerabilities, and logic flaws that require understanding business intent rather than just code structure routinely escape them.
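To make the contrast concrete, here is a toy signature rule in the spirit of those traditional tools. This is an illustrative sketch, not how any real scanner is implemented (production SAST tools work on parse trees, not regexes), but it shows the core limitation: the rule only fires when a line matches the known-bad shape.

```python
import re

# A toy signature rule in the spirit of traditional SAST tools.
# Illustrative only -- real scanners analyze ASTs, not raw lines.
# This one flags string-formatted SQL, the classic injection signature.
SQLI_PATTERN = re.compile(
    r'execute\(\s*["\'].*%s.*["\']\s*%'   # execute("... %s ..." % ...)
    r'|execute\(\s*f["\']'                # execute(f"...")
)

def scan_line(line: str) -> bool:
    """Return True if the line matches the known-bad signature."""
    return bool(SQLI_PATTERN.search(line))

# Caught: the injection is visible in a single line.
print(scan_line('cursor.execute("SELECT * FROM users WHERE id = %s" % uid)'))

# Missed: the same flaw, but the tainted query was built elsewhere,
# so no individual line matches the rule.
print(scan_line('cursor.execute(query)'))
```

The second call is the whole problem in miniature: the vulnerability still exists, but because it is spread across the program rather than sitting on one line, the signature never fires.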
Claude Code approaches a codebase the way a security researcher would. It reads the code, builds a mental model of what the system is trying to do, and then reasons about what could go wrong from an attacker's perspective. It asks: what assumptions is this code making? Where does it trust data it should not trust? Where does it handle edge cases differently from the happy path in ways that an attacker could exploit? What happens if this function receives input that is technically valid but semantically unexpected?
This kind of reasoning requires understanding code at a semantic level, not just a syntactic one. It requires the ability to trace data flows across function boundaries, across files, across service calls. It requires holding the entire context of what a piece of software is trying to accomplish while simultaneously thinking about how to break it.
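Here is a minimal, hypothetical example of the kind of bug that semantic reasoning catches and signature matching cannot. Every name in it is invented for illustration; the point is that the code is syntactically clean, and the flaw lives in what one branch fails to check.

```python
# Hypothetical endpoint with a logic flaw no signature can catch:
# the code contains no "bad pattern" -- the bug is a missing check
# on one branch. All names here are invented for illustration.

DOCUMENTS = {
    1: {"owner": "alice", "body": "alice's notes"},
    2: {"owner": "bob",   "body": "bob's salary data"},
}

def get_document(user: str, doc_id: int, fmt: str = "json") -> str:
    doc = DOCUMENTS[doc_id]
    if fmt == "json":
        # Happy path: ownership is checked.
        if doc["owner"] != user:
            raise PermissionError("not your document")
        return doc["body"]
    # Edge case: the export path forgot the same ownership check,
    # so any user can read any document by requesting fmt="export".
    return f"EXPORT: {doc['body']}"
```

Finding this requires knowing the intent ("documents are private to their owner"), noticing that the two branches enforce different invariants, and asking what an attacker does with the difference. That is a semantic judgment about purpose, not a syntactic match, which is why it sits in the category of flaws that routinely escaped traditional tools.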
That is exactly what large language models turn out to be very good at.
Moravec's Paradox and Why This Was Predictable
Hans Moravec was a roboticist at Carnegie Mellon who in 1988 articulated something that was immediately obvious in retrospect: the tasks humans find hardest are often the easiest for computers, and the tasks humans find trivially easy are often the hardest for computers.
Solving a differential equation is hard for a human. A calculator does it instantly. Playing chess at grandmaster level requires years of dedicated study for a human. A chess engine running on a laptop beats every human alive. These feel like cognitively demanding tasks, and they are, for humans. But they are demanding precisely because they require formal manipulation of symbols according to explicit rules. For computers, that is the easy part.
Now consider what a two-year-old can do. A toddler can recognize a face in any lighting, at any angle, partially occluded, in a painting, in a cartoon, in a shadow. A toddler can pick up an irregularly shaped object without dropping it. A toddler can understand that the same word said by a stranger means something different from when it is said by a parent. A toddler can navigate a cluttered room without walking into anything.
These things feel trivially easy because evolution spent hundreds of millions of years building the hardware and software to do them. We do not experience them as effortful because they are handled by parts of our brain that operate below conscious awareness. Getting a robot to do any of them reliably is one of the hardest unsolved problems in computer science.
Moravec's insight was that the hierarchy of difficulty is inverted. The things humans think of as hard because they require conscious effort are often easy for machines. The things humans do without thinking are nearly impossible for machines.
How Moravec's Paradox Maps to AI in 2026
This framework predicts the actual order in which AI has been automating human capabilities with remarkable accuracy.
AI won at chess in 1997. At Go in 2016. It solved protein folding in 2020. It writes code well enough to pass coding interviews at major tech companies. It passed the bar exam. It scores in the 90th percentile on standardized tests designed to measure human intelligence. It found 500 security bugs that expert humans missed.
These are all things that humans consciously identify as hard. They require years of study, expertise, and practice for humans to become proficient at. They feel like the last things a machine should be able to do. But from a computational perspective, they are formal, rule-bound, and well-defined enough that large models trained on the right data can learn to do them.
Meanwhile, AI still cannot fold laundry reliably. A robotic system cannot reliably pick up a random object from a pile and place it precisely somewhere else without extensive setup and controlled conditions. AI cannot navigate a completely novel physical environment safely without careful engineering. These feel trivially easy to humans because our motor cortex and visual processing systems evolved specifically to do them. For a machine, they require solving extraordinarily hard problems in real-time physical reasoning and manipulation.
The implication for AI development is striking. The cognitive task that cybersecurity professionals spend decades mastering — understanding code deeply enough to reason about what an attacker would do — is closer to the "chess" end of Moravec's paradox than the "folding laundry" end. It is formal reasoning about complex but ultimately symbolic systems. AI is very good at this.
What the 500 Bugs Actually Mean
The specific claim that Claude Code found 500 security bugs in open source projects that experts had missed deserves some context.
Open source code is often high quality precisely because it is reviewed by many people with strong incentives to find problems. Bugs in widely-used open source libraries can affect millions of systems, so security researchers actively look for them. The CVE database exists specifically to catalog known vulnerabilities. Projects with active communities conduct security audits.
Finding 500 real, previously unknown vulnerabilities in code that has been reviewed by experienced humans is not a marginal improvement over existing tools. It is a qualitative shift in capability. It means AI security scanning is finding things that human security researchers, running traditional tools, with active incentives to find problems, missed.
Some of those 500 bugs had been sitting in code for years. Some are in code that runs on systems you use. Some were probably known attack vectors in theory but had never been specifically identified in those codebases.
The practical implication is that the economics of security are changing. A security audit that previously required weeks of expensive expert time can now be done more thoroughly and faster with AI assistance. For defenders, this is good news: finding and fixing vulnerabilities before attackers do becomes faster and cheaper. For the security consulting industry, it represents the same pressure that every other knowledge work profession is facing.
What This Means for Security Professionals
The reasonable question is whether Claude Code and similar tools make security professionals obsolete. The answer is: not the good ones, and not soon, but the pressure is real.
The work that AI can now do well is the pattern-based, rule-based work. Scanning code for known classes of vulnerabilities. Identifying common misconfigurations. Checking that cryptographic implementations follow best practices. Reviewing authentication and authorization logic for obvious flaws. This is a significant portion of what junior and mid-level security professionals spend their time on.
The work that remains genuinely hard is the adversarial, creative, and contextual work. Designing novel attack vectors against a system you have not seen before. Understanding the business context well enough to identify what an attacker would actually want to achieve. Red-teaming an organization across its people, processes, and technology together, not just its code. Incident response when something has already gone wrong. Threat modeling for a new architecture before the code exists.
These require judgment, creativity, and contextual understanding that current AI systems handle poorly. They are closer to the "folding laundry" end of Moravec's paradox — they require the kind of general world knowledge and situational awareness that AI has not solved.
The security professionals who will feel pressure soonest are those whose value is primarily in doing work that AI now does faster and cheaper. The ones who are safe are those whose value is in the judgment, creativity, and context that sits above the code.
The Broader Lesson
Cybersecurity joining the list of things AI does surprisingly well is not an anomaly. It is Moravec's paradox playing out in real time.
The pattern is consistent. Every time people said "AI cannot do X because X requires deep human expertise and creativity," AI eventually did X — sometimes worse than the best humans, sometimes better. And the order in which it happened tracked Moravec's logic: formal reasoning first, physical and social reasoning last.
The things still safe from AI are the things humans do without consciously trying: reading a room, building trust with a stranger, navigating genuinely novel physical situations, understanding the unspoken context that surrounds any explicit communication. These are hard not because humans spent years learning them but because evolution spent eons building the hardware to do them effortlessly.
The jobs that are genuinely safe are the ones that require this kind of evolved human capability in combination with domain knowledge. The jobs under pressure are the ones where the domain knowledge is the main thing — the chess, the security scanning, the bar exam, the coding interview — and where the evolved human capability is not required.
Claude Code finding 500 bugs is a data point in a trend that has been consistent since at least 1997. The frontier keeps falling. Moravec predicted which frontiers would fall first, and he was right. Paying attention to his paradox is still the most useful framework for thinking about what comes next.
Written by
Abhishek Gautam
Full Stack Developer & Software Engineer based in Delhi, India. Building web applications and SaaS products with React, Next.js, Node.js, and TypeScript. 8+ projects deployed across 7+ countries.