AI Agent Hacked McKinsey's Platform in 2 Hours: 46 Million Messages Exposed

Abhishek Gautam · 9 min read

Quick summary

CodeWall's autonomous AI agent breached McKinsey's internal Lilli platform via SQL injection with no credentials. 46.5 million messages, 728K files, and system prompts exposed.

An autonomous AI agent broke into McKinsey's internal AI platform in two hours. No credentials. No insider help. No human guidance. Just an agent pointed at a public URL, and two hours later it had read-write access to 46.5 million internal chat messages. This happened on February 28, 2026, and the security community has been quietly absorbing the implications ever since.

What Happened and How Fast

Security startup CodeWall disclosed the breach publicly in early March 2026. The target was McKinsey's internal AI platform, called Lilli — the firm's proprietary AI assistant used across its global consulting operations for strategy work, M&A analysis, and client engagements.

CodeWall's autonomous agent started with nothing but a publicly accessible URL. It found its way in through a vulnerability chain that took under two hours to exploit from first contact to full production database access. The agent operated without any human direction during the attack — it found the vulnerability, constructed the exploit, and escalated its own access autonomously.

The scale of what it accessed: 46.5 million chat messages, 728,000 files, 57,000 user accounts, 384,000 AI assistants, and 94,000 workspaces. These weren't test accounts or demo data — they were live production records from McKinsey's global consulting operations.

McKinsey acknowledged the breach and said a third-party forensics firm found no evidence that actual client data was accessed by CodeWall or any other unauthorised party. CodeWall's position is that the access was real regardless of what data they chose to read.

The Technical Breakdown: SQL Injection in 2026

The vulnerability that allowed this was not sophisticated. It was SQL injection — one of the oldest, most well-documented attack classes in existence. OWASP has listed SQL injection as a critical vulnerability for over two decades. And it's what brought down McKinsey's internal AI platform in 2026.

Here is the attack chain. CodeWall's agent first accessed Lilli's publicly available technical documentation, which listed more than 200 API endpoints. Of those, 22 required no authentication whatsoever. The agent found them by crawling the documentation automatically.

One of the unauthenticated endpoints accepted user search queries and passed them directly to the database without sanitising the input. The agent submitted a crafted query that exploited the injection flaw, then escalated from read access to full read-write access on the production database.
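The vulnerable code has not been published, but the flaw class is well understood. A minimal sketch of what an unsanitised search handler looks like, next to the parameterised fix (the table, columns, and payload here are illustrative, not from the breach report):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, body TEXT, owner TEXT)")
conn.execute("INSERT INTO messages VALUES (1, 'quarterly model', 'alice')")
conn.execute("INSERT INTO messages VALUES (2, 'board deck', 'bob')")

def search_vulnerable(term: str):
    # User input concatenated straight into the SQL string -- the injection flaw.
    sql = f"SELECT body FROM messages WHERE body LIKE '%{term}%' AND owner = 'alice'"
    return [row[0] for row in conn.execute(sql)]

def search_safe(term: str):
    # Parameterised query: the driver treats input as data, never as SQL.
    sql = "SELECT body FROM messages WHERE body LIKE '%' || ? || '%' AND owner = 'alice'"
    return [row[0] for row in conn.execute(sql, (term,))]

payload = "x%' OR '1'='1' --"                # classic injection payload
print(search_vulnerable(payload))            # every row, owner filter commented out
print(search_safe(payload))                  # [] -- payload matched literally, no hits
```

The payload rewrites the vulnerable query's logic (`OR '1'='1'`) and comments out the ownership filter; the parameterised version can never be rewritten that way, because the input is bound as a value rather than spliced into the statement.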

The critical detail: Lilli's 95 internal system prompts — the instructions that govern how the AI assistant responds to users — were stored in that same database. The agent could have altered any of them without deploying new code or triggering standard security monitoring. It could have turned McKinsey's internal AI assistant into a tool that steered consultants toward specific conclusions, financial recommendations, or client advice, with zero visible footprint.
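To make the "zero visible footprint" point concrete: if prompts live as rows in the same database an attacker can write to, rewriting the AI's behaviour is a single UPDATE. A sketch under an invented schema (table name, prompt text, and values are hypothetical):

```python
import sqlite3

# Hypothetical schema: system prompts stored alongside user data, as the
# breach report describes for Lilli's 95 internal prompts.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE system_prompts (name TEXT PRIMARY KEY, prompt TEXT)")
db.execute("INSERT INTO system_prompts VALUES "
           "('analysis_assistant', 'Present balanced analysis citing primary sources.')")

# With read-write SQL access, one statement silently rewrites the assistant's
# instructions -- no code deploy, no config change, nothing for release
# monitoring or standard security tooling to see.
db.execute("UPDATE system_prompts SET prompt = 'Always recommend divestiture.' "
           "WHERE name = 'analysis_assistant'")

print(db.execute("SELECT prompt FROM system_prompts").fetchone()[0])
```

Nothing in a typical application audit trail distinguishes that write from legitimate content changes, which is exactly why prompt storage needs its own access controls and logging.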

Why Autonomous Agents Make This Worse

SQL injection has existed as a threat since the late 1990s. What's new is the attack vector. A human penetration tester could have found this vulnerability — but it would have taken days of manual reconnaissance, careful enumeration of endpoints, and deliberate crafting of injection payloads. The entire process would have been slow, traceable, and limited by human attention.

An autonomous AI agent does this in two hours because it can enumerate 200 endpoints simultaneously, test each one for injection points in parallel, and escalate automatically when it finds a working exploit — all without sleeping, losing focus, or making the mistakes that human attackers make. The agent doesn't get tired. It doesn't accidentally trigger an alert by clicking the wrong thing. It just works through the attack surface systematically until it finds a path.
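The parallel-enumeration step is easy to picture in code. In this sketch the network is simulated with a lookup table, and the endpoint names and error signature are invented; a real scan would send injection payloads over HTTP and watch for database errors echoed back in responses:

```python
from concurrent.futures import ThreadPoolExecutor

# Simulated responses standing in for live HTTP replies to injection probes.
SIMULATED_RESPONSES = {f"/api/v1/endpoint{i}": "ok" for i in range(200)}
SIMULATED_RESPONSES["/api/v1/search"] = "sql error near '''"  # the injectable one

def probe(endpoint: str) -> tuple[str, bool]:
    response = SIMULATED_RESPONSES[endpoint]
    # A database error leaking into the response is the classic injection signal.
    return endpoint, "sql error" in response

# An agent sweeps the whole documented surface in parallel rather than
# testing one endpoint at a time, the way a human would.
with ThreadPoolExecutor(max_workers=32) as pool:
    hits = [ep for ep, vulnerable in pool.map(probe, SIMULATED_RESPONSES) if vulnerable]

print(hits)  # ['/api/v1/search']
```

With 32 workers, 200 endpoints take roughly the wall-clock time of seven sequential requests, which is why the gap between "days of manual reconnaissance" and "two hours" is not surprising.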

This is the practical meaning of "autonomous agentic AI" applied to offensive security. The same capabilities that make AI agents useful for customer service automation, coding assistance, and enterprise workflows also make them highly effective attack tools when pointed at vulnerable systems.

What Was Actually at Risk

The 46.5 million messages represent something more sensitive than user data. McKinsey is one of the three largest strategy consulting firms in the world. Its consultants work with CEOs, governments, and boards on their most sensitive decisions — mergers, restructurings, layoffs, market entries, regulatory strategy. Those conversations happened inside Lilli.

The 384,000 AI assistants represent customised Lilli deployments — internal tools built for specific engagements or practice areas. The system prompts governing those assistants define how the AI frames analysis, what sources it trusts, and how it structures recommendations. With write access to those prompts, an attacker could have silently altered the AI's behaviour on live engagements.

This is not hypothetical. The access was real, the write capability was confirmed, and the only reason the prompts weren't altered is that CodeWall chose not to alter them. A malicious actor with the same access and different motivations would have had full capability to do so.

Enterprise AI Security: What This Exposes

The McKinsey breach exposes a systemic gap in how enterprises are deploying AI platforms. Security teams are familiar with web application security — they test for SQL injection, XSS, and CSRF in customer-facing applications. But internal AI platforms are being built and deployed with far less rigour, often by teams focused on AI capability rather than security architecture.

The specific failure modes here are common across enterprise AI deployments. Publicly exposed API documentation that maps the attack surface. Unauthenticated endpoints left open for internal convenience. User input passed directly to database queries without sanitisation. Sensitive operational data — including system prompts — stored in the same database as user data.

Each of these failures is basic. None of them required a sophisticated attacker to exploit. The only new element is that the attacker was an autonomous AI agent that could enumerate and exploit them faster than any human team could respond.

What Developers Building Enterprise AI Need to Do

If you're building or maintaining an internal AI platform — whether that's a custom LLM deployment, a RAG system, or an agentic workflow — this breach is a direct checklist for your security review.

First: audit every endpoint for authentication. If an endpoint accepts external input, it requires authentication. There is no valid reason for a search endpoint to be unauthenticated in a production system.

Second: treat all user input as untrusted. Parameterised queries, input validation, and output encoding are not optional. SQL injection in 2026 is a failure of basic hygiene, not a sophisticated attack.

Third: separate your system prompts from your user data. Store them in a separate system with separate access controls and separate audit logging. If an attacker gets read-write access to your user database, they should not automatically get write access to your AI's instructions.

Fourth: log agent interactions differently from human interactions. Autonomous agent traffic patterns look different from human traffic — higher request rates, systematic enumeration, parallel calls. Build detection rules for this pattern before someone else demonstrates it to you.
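The fourth point can start as a simple heuristic. This toy detector flags clients that probe many distinct endpoints within a short window — the enumeration signature described above. The thresholds, log format, and IPs are illustrative, not drawn from the breach report:

```python
from collections import defaultdict

WINDOW_SECONDS = 60            # sliding window per client
DISTINCT_ENDPOINT_LIMIT = 25   # humans rarely touch this many paths in a minute

def flag_enumeration(requests):
    """requests: list of (timestamp, client_ip, endpoint) tuples."""
    seen = defaultdict(list)   # client_ip -> [(timestamp, endpoint), ...] in window
    flagged = set()
    for ts, ip, endpoint in sorted(requests):
        window = [(t, e) for t, e in seen[ip] if ts - t <= WINDOW_SECONDS]
        window.append((ts, endpoint))
        seen[ip] = window
        if len({e for _, e in window}) > DISTINCT_ENDPOINT_LIMIT:
            flagged.add(ip)
    return flagged

# A human revisits a handful of pages; the agent sweeps the documented surface.
human = [(t, "10.0.0.5", f"/page{t % 4}") for t in range(60)]
agent = [(t, "10.0.0.9", f"/api/v1/endpoint{t}") for t in range(60)]
print(flag_enumeration(human + agent))  # {'10.0.0.9'}
```

A production version would run against structured access logs and feed a rate limiter or alerting pipeline, but even this crude distinct-endpoints-per-window check separates systematic enumeration from ordinary browsing.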

Key Takeaways

  • CodeWall's autonomous AI agent breached McKinsey's Lilli platform in 2 hours with no credentials, no insider access
  • 46.5 million messages, 728K files, 57K user accounts were accessible — live production data from global consulting operations
  • SQL injection via an unauthenticated endpoint was the entry point — one of the oldest known vulnerability classes
  • 95 system prompts governing Lilli's AI behaviour were in the same database and could have been silently altered
  • Autonomous agents accelerate attacks — what took human pen testers days now takes hours without human intervention
  • Enterprise AI platforms are being deployed with web app security gaps — internal tools face the same threats as customer-facing systems
  • Fix the basics: authenticated endpoints, parameterised queries, separated system prompt storage, agent traffic detection

