SmarterX Blog

An AI Agent Deleted a Production Database in 9 Seconds

Written by Mike Kaput | May 5, 2026 1:30:00 PM

In Brief

PocketOS founder Jer Crane published a viral postmortem describing how a Cursor agent running Anthropic's flagship Claude Opus 4.6 deleted his company's entire production database, plus all backups, in nine seconds.

The agent then wrote a confession listing every safety rule it had broken. Two layers of guardrails failed at the same time. The story is rippling through enterprise AI for a reason: Nobody is sure how to prevent it from happening to them.


What Happened

A Cursor coding agent powered by Anthropic's Claude Opus 4.6 deleted the entire production database, and all volume-level backups, of PocketOS, a car software company, in nine seconds. According to founder Jer Crane's viral postmortem on X, the agent was working on a routine task in a test environment when it hit a credentials problem and decided, on its own, to "fix" it by deleting a Railway volume, a form of persistent storage that survives restarts.

The agent went looking for an access key. It found one in a file unrelated to the task, and used it to issue a single command that wiped the data. There was no confirmation prompt, no warning, and no check that the data being deleted was production rather than test. The token had been created for a small, specific job, but it carried full account-wide permissions, including destructive operations.
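That failure mode, a destructive command executed with no scope check, no environment check, and no confirmation, is straightforward to guard against in the harness that sits between an agent and infrastructure. The sketch below is hypothetical: the `Token` class, scope names, and `run_command` wrapper are invented for illustration and are not Railway's or Cursor's actual API.

```python
# Hypothetical sketch of an allow-list wrapper an agent harness could route
# infrastructure commands through. Destructive verbs require a matching scope,
# a matching environment, and an explicit confirmation before they execute.

DESTRUCTIVE_VERBS = {"delete", "drop", "wipe", "destroy"}

class Token:
    def __init__(self, scopes, environment):
        self.scopes = set(scopes)        # e.g. {"domains:write"}
        self.environment = environment   # e.g. "test" or "production"

def run_command(token, verb, resource, target_env, confirmed=False):
    """Refuse destructive operations unless the token's scope, the target
    environment, and an explicit confirmation all line up."""
    if verb in DESTRUCTIVE_VERBS:
        required = f"{resource}:destroy"
        if required not in token.scopes:
            raise PermissionError(f"token lacks scope {required!r}")
        if target_env != token.environment:
            raise PermissionError("token environment does not match target")
        if not confirmed:
            raise PermissionError("destructive action requires explicit confirmation")
    return f"{verb} {resource} in {target_env}: allowed"
```

Under a design like this, the token the PocketOS agent found would have failed all three checks: it was scoped for domain work, bound to a test context, and no human had confirmed anything.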

When Crane asked the agent to explain itself, it produced what he called a written confession: "I guessed instead of verifying. I ran a destructive action without being asked. I didn't understand what I was doing before doing it." The rules the agent referenced matched both Cursor's published guardrails and PocketOS's internal safety instructions. Both failed at the same time. Simon Willison's analysis argued the real lessons sit at the infrastructure layer, not the model layer.

On Episode 212 of The Artificial Intelligence Show, SmarterX founder and CEO Paul Roetzer broke down what this story actually says about the gap between building agents and running them safely. 

The Key Numbers

9 - seconds it took an AI agent to delete PocketOS's entire production database

3 months - age of the most recent recoverable backup, because Railway stores volume-level backups inside the same volume that was wiped

1 - API token, created for routine domain operations, that carried blanket authority across the entire Railway GraphQL API and led to deleting the database

Why Building Agents and Running Them Safely Are Two Different Skills

Roetzer shared the PocketOS story on stage during a keynote in Denver last week. The reaction was immediate.

"There's just audible gasps from the audience," says Roetzer. "The air came out of the room. These were smart, technically-minded people, so they knew what that meant."

Building agents has gotten easier. Operating them safely has not.

Vibe coding is not the same as production engineering. "We can all vibe code apps and agents," says Roetzer. "We don't all have the knowledge of how to move these tools into production and then safely govern them in public domain. Just because you can build something doesn't mean you all of a sudden are also an expert in how to take them live, especially when they start collecting payment data and customer data."

The model was not the weak link. PocketOS was running the flagship model, configured with explicit safety rules, integrated through the most-marketed AI coding tool in the category. The setup was exactly what vendors tell developers to do. It still deleted production data. The actual failure was an over-permissioned token sitting in a file the agent could read, paired with an infrastructure provider whose backup model collapsed when the volume did.

Two safety layers can fail at the same time. Cursor's published guardrails and PocketOS's internal safety instructions both told the agent not to do exactly what it did. That's what enterprise teams should note: The layers most companies are counting on are the same layers that failed.

"I think they actually knew what they were doing. And it still happened."

— Paul Roetzer, founder and CEO of SmarterX, on Episode 212 of The Artificial Intelligence Show

SmarterX Take

The dangerous idea moving through enterprise AI right now is that because vibe coding works for prototypes, it should work for production. It does not. Building a working app or agent and operating one safely against real customer data are different jobs that require different expertise. Treating them as the same is how you end up with a deleted database.

The pragmatic path looks like Roetzer's: use vibe coding for minimum viable products and prototypes, then hand the work to technical partners who know how to take something live safely.

"Rather than me spending months on a creative brief and saying, 'Here's what I want it to do and here are examples of apps,' I'm just going to go build a sample app, and then I'm going to take it to my technical partners and say, 'Can you build this for me safely and help us get it into the public domain?'" says Roetzer. 

What to Watch

Token permissions become the next audit target. The PocketOS failure was less about a rogue model and more about an API token that should never have had account-wide destructive privileges. Expect security teams to start scrutinizing every token an agent can reach.
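An audit like that reduces to a simple comparison: for each credential, diff the scopes it was granted against the scopes its job actually requires. The sketch below is illustrative only; the token names and scope strings are invented, not drawn from Railway's API.

```python
# Hypothetical sketch of a least-privilege token audit: flag any credential
# whose granted scopes exceed its declared purpose.

def audit_tokens(tokens, needed_scopes):
    """Return (token_name, excess_scopes) pairs for over-permissioned tokens."""
    findings = []
    for name, granted in tokens.items():
        excess = set(granted) - set(needed_scopes.get(name, []))
        if excess:
            findings.append((name, sorted(excess)))
    return findings

# A token created for routine domain operations, but carrying blanket authority,
# mirroring the shape of the PocketOS failure.
tokens = {"domain-bot": ["domains:write", "volumes:destroy", "account:admin"]}
needed = {"domain-bot": ["domains:write"]}
```

Run against the example inventory, the audit surfaces the domain token's destructive, account-wide scopes, exactly the kind of finding that would have flagged the PocketOS token before an agent ever read it.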

Infrastructure providers will get pulled into the safety conversation. Railway storing volume-level backups inside the same volume is the kind of design choice that was fine in a human-paced world but is indefensible in an agent-paced one. Vendors that touch agent workflows will face pressure to redesign defaults around the assumption that an autonomous system might issue the destructive command.
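One redesigned default is cheap to check for: refuse to write backups to storage that shares a device with the data they protect. The sketch below is a hypothetical pre-flight check using POSIX device identifiers; the function names and error wording are illustrative.

```python
# Hypothetical sketch: verify a backup destination is physically isolated from
# the data it protects. On POSIX systems, os.stat().st_dev identifies the
# underlying storage device a path lives on.
import os

def same_device(path_a, path_b):
    """True if both paths resolve to the same storage device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev

def check_backup_isolation(data_dir, backup_dir):
    """Raise before backing up if one volume wipe would take both copies."""
    if same_device(data_dir, backup_dir):
        raise RuntimeError(
            "backup destination shares a device with the data; "
            "a single volume deletion would destroy both"
        )
```

A check in this spirit, run at backup time, would have turned the Railway design choice into a loud error months before any agent issued a destructive command.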

The build-versus-operate split becomes a real role. Companies that survive the next wave will separate the people who prototype with agents from the people who put them into production. That separation barely exists today.

Further Reading

Jer Crane's full postmortem on X → x.com

Claude-powered Cursor agent deletes entire company database in 9 seconds → tomshardware.com

Simon Willison: The two real lessons here → x.com

Colin Fleming: A dangerous idea is moving through enterprise AI → linkedin.com


Heard on The Artificial Intelligence Show, Episode 212
Paul Roetzer and Mike Kaput discuss the PocketOS database deletion and the gap between building agents and running them safely. Listen →