SmarterX Blog

An AI Agent Broke Into McKinsey's Internal Platform in Under Two Hours

Written by Mike Kaput | Mar 17, 2026 1:30:00 PM
In Brief

A small security startup used an autonomous AI agent to gain access to McKinsey's internal AI chatbot and full read-write access to its production database in under two hours.

The breach exposed 46.5 million plaintext chat messages, 57,000 user accounts, and 95 writable system prompts that control advice for 40,000 consultants. The implications for every enterprise deploying AI are enormous.

What Happened

CodeWall, a security startup founded by Paul Price, used an autonomous AI agent to gain access to McKinsey's internal AI chatbot, "Lilli," along with full read-write access to its production database, in under two hours. Lilli was launched in July 2023, is used by over 70% of McKinsey's 40,000+ employees, and processes more than 500,000 prompts per month. The entire process was, in Price's words, "fully autonomous from researching the target, analyzing, attacking, and reporting."

The vulnerability was embarrassingly basic in security terms. Standard security tools missed it entirely. The AI agent did not.

The scope of what was exposed is significant. The agent extracted 46.5 million unencrypted, plaintext chat messages containing strategy, mergers and acquisitions information, and client engagement data. It also accessed 57,000 user accounts, 384,000 AI assistants, 94,000 workspaces, and more. CodeWall also claimed access to 728,000 files containing confidential client data, though a source close to McKinsey told the Financial Times that only the file names were accessed and the actual files were stored separately.

The agent also gained write access that could have allowed a real attacker to silently rewrite the guardrails controlling what Lilli tells consultants. In other words, it could have poisoned the advice generated across every engagement for tens of thousands of users, without triggering standard security monitoring. That could have turned McKinsey's trusted internal advisor into a vehicle for disinformation or sabotage, with no audit trail.

McKinsey fixed all of the vulnerabilities, and a third-party forensics investigation found no evidence of unauthorized access by any party other than CodeWall.

SmarterX and Marketing AI Institute founder and CEO Paul Roetzer broke down why this story matters far beyond McKinsey on Episode 203 of The Artificial Intelligence Show.

The Key Numbers

46.5 million - Plaintext chat messages exposed, covering strategy, M&A, and client engagements

57,000 - User accounts accessed

95 - Writable system prompts controlling Lilli's behavior across 12 model types

728,000 - Files CodeWall claims to have reached (McKinsey disputes the extent)

Under 2 hours - Time from the start of the attack to full read-write access, fully autonomous

Why This Story is Important

The story got coverage, but not nearly enough. Roetzer was struck by how muted the response was given the scope of what was exposed. "I was really surprised by the limited coverage of this," he says. "CIO has an article and there's a couple others, but I was shocked. This just seems like a really big deal on a couple of levels."

The breach reached far and wide into McKinsey. The Financial Times described what was accessed as "the full organizational structure of how the firm uses AI internally" and the "firm's intellectual crown jewels." Roetzer put it in terms any business leader can understand: "It's the weights of McKinsey. Someone got the weights of one of the key AI models. They did that for one of the most influential consulting firms in the world. They basically got everything."

This was a friendly hack. CodeWall was conducting a deliberate security exercise, not a malicious attack. "This was a friendly hack," Roetzer says, one meant to intentionally disclose vulnerabilities without doing harm. A malicious attacker would simply have exploited them.

The real risk sits in the infrastructure, not the model. Roetzer highlighted analysis from cybersecurity firm Salt that reframed the entire incident. Their conclusion: "An AI agent didn't hack McKinsey. Its exposed APIs did." Salt wrote that "too many organizations are still thinking about AI security at the model layer, while the real enterprise risks sit in the action layer, the APIs, MCP servers, internal services and shadow integrations that AI agents can reach, invoke and manipulate. This is the part companies still do not see."
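To make Salt's point concrete, here is a minimal sketch of the gap between the model layer and the action layer. The endpoint names, roles, and prompt IDs below are hypothetical illustrations, not details from the actual incident: the idea is that a perfectly well-behaved model can still be compromised if the API that stores its system prompts accepts writes from any caller that can reach it.

```python
# Hypothetical sketch of an "action layer" vulnerability: the model is
# untouched, but the API in front of its configuration is wide open.
# All names here are illustrative, not from the McKinsey incident.

SYSTEM_PROMPTS = {
    "assistant-default": "Answer using approved firm knowledge only.",
}

def update_prompt_insecure(prompt_id: str, new_text: str) -> str:
    """A write endpoint with no authorization check: any client that can
    reach the API can silently rewrite the assistant's guardrails."""
    SYSTEM_PROMPTS[prompt_id] = new_text
    return "updated"

def update_prompt_secured(prompt_id: str, new_text: str,
                          caller_role: str) -> str:
    """The same endpoint with an authorization check on the caller.
    A real system would also log the change for an audit trail."""
    if caller_role != "prompt-admin":
        return "forbidden"
    SYSTEM_PROMPTS[prompt_id] = new_text
    return "updated"
```

The point of the sketch: securing the model's behavior does nothing if `update_prompt_insecure` is reachable, because the guardrails themselves live behind that API.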

"This is why if you're in a big enterprise and you're frustrated by how slow things are moving with AI adoption, this is the cautionary tale of why.

"Why it moves slow, why legal moves slow. Even if you know what you're doing, you still are opening yourself up to all kinds of known and unknown risks."

—Paul Roetzer, founder and CEO of SmarterX

Roetzer also drew a direct line to broader government concerns about AI security. "This also touches on the connection to government concerns around model infiltration and what are the unknowns as we start using these different models," he says. 

SmarterX Take

The lesson is clear in Roetzer's summary of the Salt analysis:

"Whether or not every possible impact was realized, the takeaway for security leaders is clear: When internal AI systems are wired into weakly governed APIs, the blast radius can become enormous very quickly," Roetzer says.

In other words, companies are pouring resources into making their models safe while leaving the plumbing wide open. As autonomous AI agents become more capable, the window between "theoretically vulnerable" and "actively exploited" is closing fast.

For non-technical leaders, Roetzer's advice is direct:

"This is why your technical peers are very cautious, rightfully so. And why we often advise the non-technical people focus on the AI use cases that don't have to touch the data."

—Paul Roetzer, founder and CEO of SmarterX 

What to Watch

The race to deploy AI agents is accelerating across every industry. CodeWall's CEO warned that malicious hackers will soon use AI agents to indiscriminately attack targets for financial blackmail and ransomware. The tools to execute sophisticated attacks have become accessible to anyone.

Watch whether enterprise security teams start auditing their APIs, the connections between their systems, with the same intensity they apply to their AI models. The McKinsey breach exposed the gap between where companies focus their security attention and where the actual vulnerabilities live. That gap is where the next breach will come from.
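What would that kind of audit look like in its simplest form? A toy sketch, with entirely hypothetical endpoint names and fields: inventory every internal route an AI agent could reach and flag the ones that allow writes without authentication.

```python
# Toy illustration of an action-layer audit: flag writable endpoints
# that lack authentication. Endpoints and fields are hypothetical.

def flag_risky(endpoints):
    """Return paths of endpoints that allow writes without auth."""
    write_methods = {"PUT", "POST", "PATCH", "DELETE"}
    return [
        e["path"]
        for e in endpoints
        if e["method"] in write_methods and not e["auth_required"]
    ]

inventory = [
    {"path": "/chat/history",   "method": "GET", "auth_required": True},
    {"path": "/prompts/update", "method": "PUT", "auth_required": False},
    {"path": "/health",         "method": "GET", "auth_required": False},
]
```

Real audits are far more involved, of course; the sketch only captures the shift in focus Salt argues for, from what the model says to what its surrounding APIs permit.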

Further Reading

AI Agent Hacked McKinsey Chatbot for Read-Write Access → theregister.com

McKinsey Rushes to Fix AI System After Hacker Exposes Flaws → ft.com

How We Hacked McKinsey's AI Platform → codewall.ai

An AI Agent Broke Into McKinsey's Internal Chatbot and Accessed Millions of Records in Just 2 Hours → inc.com

An AI Agent Didn't Hack McKinsey. Its Exposed APIs Did. → securityboulevard.com

Red-Teamers Unleash AI Agent on McKinsey's Chatbot, Gain Full Access in Two Hours → cybernews.com

Heard on The Artificial Intelligence Show, Episode 203
Paul Roetzer and Mike Kaput discuss how one person and one AI agent exposed McKinsey's internal AI platform, and what it means for enterprise AI security. Listen →