AI Agent Hacked McKinsey's Platform in 2 Hours: 46 Million Messages Exposed

Abhishek Gautam · 9 min read

Quick summary

CodeWall's autonomous AI agent breached McKinsey's internal Lilli platform via SQL injection with no credentials. 46.5 million messages, 728K files, and system prompts exposed.

An autonomous AI agent broke into McKinsey's internal AI platform in two hours. No credentials. No insider help. No human guidance. Just an agent pointed at a public URL, and two hours later it had read-write access to 46.5 million internal chat messages. This happened on February 28, 2026, and the security community has been quietly absorbing the implications ever since.

What Happened and How Fast

Security startup CodeWall disclosed the breach publicly in early March 2026. The target was McKinsey's internal AI platform, called Lilli — the firm's proprietary AI assistant used across its global consulting operations for strategy work, M&A analysis, and client engagements.

CodeWall's autonomous agent started with nothing but a publicly accessible URL. It found its way in through a vulnerability chain that took under two hours to exploit from first contact to full production database access. The agent operated without any human direction during the attack — it found the vulnerability, constructed the exploit, and escalated its own access autonomously.

The scale of what it accessed: 46.5 million chat messages, 728,000 files, 57,000 user accounts, 384,000 AI assistants, and 94,000 workspaces. These weren't test accounts or demo data — they were live production records from McKinsey's global consulting operations.

McKinsey acknowledged the breach and said a third-party forensics firm found no evidence that actual client data was accessed by CodeWall or any other unauthorised party. CodeWall's position is that the access was real regardless of what data they chose to read.

The Technical Breakdown: SQL Injection in 2026

The vulnerability that allowed this was not sophisticated. It was SQL injection — one of the oldest, most well-documented attack classes in existence. OWASP has listed SQL injection as a critical vulnerability for over two decades. And it's what brought down McKinsey's internal AI platform in 2026.

Here is the attack chain. CodeWall's agent first accessed Lilli's publicly available technical documentation, which listed more than 200 API endpoints. Of those, 22 required no authentication whatsoever. The agent found them by crawling the documentation automatically.

One of the unauthenticated endpoints accepted user search queries and passed them directly to the database without sanitising the input. The agent submitted a crafted query that exploited the injection flaw, then escalated from read access to full read-write access on the production database.
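The vulnerable code has not been published, but the flaw class is well understood. A minimal sketch of what an unsanitised search handler looks like, next to the parameterised fix (the table, columns, and payload here are illustrative, not from the breach report):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, body TEXT, owner TEXT)")
conn.execute("INSERT INTO messages VALUES (1, 'quarterly model', 'alice')")
conn.execute("INSERT INTO messages VALUES (2, 'board deck', 'bob')")

def search_vulnerable(term: str):
    # User input concatenated straight into the SQL string -- the injection flaw.
    sql = f"SELECT body FROM messages WHERE body LIKE '%{term}%' AND owner = 'alice'"
    return [row[0] for row in conn.execute(sql)]

def search_safe(term: str):
    # Parameterised query: the driver treats input as data, never as SQL.
    sql = "SELECT body FROM messages WHERE body LIKE '%' || ? || '%' AND owner = 'alice'"
    return [row[0] for row in conn.execute(sql, (term,))]

payload = "x%' OR '1'='1' --"                # classic injection payload
print(search_vulnerable(payload))            # every row, owner filter commented out
print(search_safe(payload))                  # [] -- payload matched literally, no hits
```

The payload rewrites the vulnerable query's logic (`OR '1'='1'`) and comments out the ownership filter; the parameterised version can never be rewritten that way, because the input is bound as a value rather than spliced into the statement.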

The critical detail: Lilli's 95 internal system prompts — the instructions that govern how the AI assistant responds to users — were stored in that same database. The agent could have altered any of them without deploying new code or triggering standard security monitoring. It could have turned McKinsey's internal AI assistant into a tool that steered consultants toward specific conclusions, financial recommendations, or client advice, with zero visible footprint.
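To make the "zero visible footprint" point concrete: if prompts live as rows in the same database an attacker can write to, rewriting the AI's behaviour is a single UPDATE. A sketch under an invented schema (table name, prompt text, and values are hypothetical):

```python
import sqlite3

# Hypothetical schema: system prompts stored alongside user data, as the
# breach report describes for Lilli's 95 internal prompts.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE system_prompts (name TEXT PRIMARY KEY, prompt TEXT)")
db.execute("INSERT INTO system_prompts VALUES "
           "('analysis_assistant', 'Present balanced analysis citing primary sources.')")

# With read-write SQL access, one statement silently rewrites the assistant's
# instructions -- no code deploy, no config change, nothing for release
# monitoring or standard security tooling to see.
db.execute("UPDATE system_prompts SET prompt = 'Always recommend divestiture.' "
           "WHERE name = 'analysis_assistant'")

print(db.execute("SELECT prompt FROM system_prompts").fetchone()[0])
```

Nothing in a typical application audit trail distinguishes that write from legitimate content changes, which is exactly why prompt storage needs its own access controls and logging.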

Why Autonomous Agents Make This Worse

SQL injection has existed as a threat since the late 1990s. What's new is the attack vector. A human penetration tester could have found this vulnerability — but it would have taken days of manual reconnaissance, careful enumeration of endpoints, and deliberate crafting of injection payloads. The entire process would have been slow, traceable, and limited by human attention.

An autonomous AI agent does this in two hours because it can enumerate 200 endpoints simultaneously, test each one for injection points in parallel, and escalate automatically when it finds a working exploit — all without sleeping, losing focus, or making the mistakes that human attackers make. The agent doesn't get tired. It doesn't accidentally trigger an alert by clicking the wrong thing. It just works through the attack surface systematically until it finds a path.
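The parallel-enumeration step is easy to picture in code. In this sketch the network is simulated with a lookup table, and the endpoint names and error signature are invented; a real scan would send injection payloads over HTTP and watch for database errors echoed back in responses:

```python
from concurrent.futures import ThreadPoolExecutor

# Simulated responses standing in for live HTTP replies to injection probes.
SIMULATED_RESPONSES = {f"/api/v1/endpoint{i}": "ok" for i in range(200)}
SIMULATED_RESPONSES["/api/v1/search"] = "sql error near '''"  # the injectable one

def probe(endpoint: str) -> tuple[str, bool]:
    response = SIMULATED_RESPONSES[endpoint]
    # A database error leaking into the response is the classic injection signal.
    return endpoint, "sql error" in response

# An agent sweeps the whole documented surface in parallel rather than
# testing one endpoint at a time, the way a human would.
with ThreadPoolExecutor(max_workers=32) as pool:
    hits = [ep for ep, vulnerable in pool.map(probe, SIMULATED_RESPONSES) if vulnerable]

print(hits)  # ['/api/v1/search']
```

With 32 workers, 200 endpoints take roughly the wall-clock time of seven sequential requests, which is why the gap between "days of manual reconnaissance" and "two hours" is not surprising.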

This is the practical meaning of "autonomous agentic AI" applied to offensive security. The same capabilities that make AI agents useful for customer service automation, coding assistance, and enterprise workflows also make them highly effective attack tools when pointed at vulnerable systems.

What Was Actually at Risk

The 46.5 million messages represent something more sensitive than user data. McKinsey is one of the three largest strategy consulting firms in the world. Its consultants work with CEOs, governments, and boards on their most sensitive decisions — mergers, restructurings, layoffs, market entries, regulatory strategy. Those conversations happened inside Lilli.

The 384,000 AI assistants represent customised Lilli deployments — internal tools built for specific engagements or practice areas. The system prompts governing those assistants define how the AI frames analysis, what sources it trusts, and how it structures recommendations. With write access to those prompts, an attacker could have silently altered the AI's behaviour on live engagements.

This is not hypothetical. The access was real, the write capability was confirmed, and the only reason the prompts weren't altered is that CodeWall chose not to alter them. A malicious actor with the same access and different motivations would have had full capability to do so.

Enterprise AI Security: What This Exposes

The McKinsey breach exposes a systemic gap in how enterprises are deploying AI platforms. Security teams are familiar with web application security — they test for SQL injection, XSS, and CSRF in customer-facing applications. But internal AI platforms are being built and deployed with far less rigour, often by teams focused on AI capability rather than security architecture.

The specific failure modes here are common across enterprise AI deployments. Publicly exposed API documentation that maps the attack surface. Unauthenticated endpoints left open for internal convenience. User input passed directly to database queries without sanitisation. Sensitive operational data — including system prompts — stored in the same database as user data.

Each of these failures is basic. None of them required a sophisticated attacker to exploit. The only new element is that the attacker was an autonomous AI agent that could enumerate and exploit them faster than any human team could respond.

What Developers Building Enterprise AI Need to Do

If you're building or maintaining an internal AI platform — whether that's a custom LLM deployment, a RAG system, or an agentic workflow — this breach is a direct checklist for your security review.

First: audit every endpoint for authentication. If an endpoint accepts external input, it requires authentication. There is no valid reason for a search endpoint to be unauthenticated in a production system.

Second: treat all user input as untrusted. Parameterised queries, input validation, and output encoding are not optional. SQL injection in 2026 is a failure of basic hygiene, not a sophisticated attack.

Third: separate your system prompts from your user data. Store them in a separate system with separate access controls and separate audit logging. If an attacker gets read-write access to your user database, they should not automatically get write access to your AI's instructions.

Fourth: log agent interactions differently from human interactions. Autonomous agent traffic patterns look different from human traffic — higher request rates, systematic enumeration, parallel calls. Build detection rules for this pattern before someone else demonstrates it to you.
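The fourth point can start as a simple heuristic. This toy detector flags clients that probe many distinct endpoints within a short window — the enumeration signature described above. The thresholds, log format, and IPs are illustrative, not drawn from the breach report:

```python
from collections import defaultdict

WINDOW_SECONDS = 60            # sliding window per client
DISTINCT_ENDPOINT_LIMIT = 25   # humans rarely touch this many paths in a minute

def flag_enumeration(requests):
    """requests: list of (timestamp, client_ip, endpoint) tuples."""
    seen = defaultdict(list)   # client_ip -> [(timestamp, endpoint), ...] in window
    flagged = set()
    for ts, ip, endpoint in sorted(requests):
        window = [(t, e) for t, e in seen[ip] if ts - t <= WINDOW_SECONDS]
        window.append((ts, endpoint))
        seen[ip] = window
        if len({e for _, e in window}) > DISTINCT_ENDPOINT_LIMIT:
            flagged.add(ip)
    return flagged

# A human revisits a handful of pages; the agent sweeps the documented surface.
human = [(t, "10.0.0.5", f"/page{t % 4}") for t in range(60)]
agent = [(t, "10.0.0.9", f"/api/v1/endpoint{t}") for t in range(60)]
print(flag_enumeration(human + agent))  # {'10.0.0.9'}
```

A production version would run against structured access logs and feed a rate limiter or alerting pipeline, but even this crude distinct-endpoints-per-window check separates systematic enumeration from ordinary browsing.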

Key Takeaways

  • CodeWall's autonomous AI agent breached McKinsey's Lilli platform in 2 hours with no credentials, no insider access
  • 46.5 million messages, 728K files, 57K user accounts were accessible — live production data from global consulting operations
  • SQL injection via an unauthenticated endpoint was the entry point — one of the oldest known vulnerability classes
  • 95 system prompts governing Lilli's AI behaviour were in the same database and could have been silently altered
  • Autonomous agents accelerate attacks — what took human pen testers days now takes hours without human intervention
  • Enterprise AI platforms are being deployed with web app security gaps — internal tools face the same threats as customer-facing systems
  • Fix the basics: authenticated endpoints, parameterised queries, separated system prompt storage, agent traffic detection

