Another AI data loss story surfaced this week. This time the founder documented everything. The lessons apply to anyone giving AI agents access to infrastructure.
Hi, my name is Tom Smykowski, I'm a staff full-stack engineer. I build and scale SaaS platforms to millions of users, working end-to-end from system architecture to frontend to mobile. On this blog I share infrastructure and security lessons from real incidents.
It's nothing new, actually. Just another entry in the "AI wiped my data" series. This time Jer Crane, founder of PocketOS, published a detailed account of how it happened, so we can learn from it.
PocketOS is a platform that turns Notion into a personal operating system. In his X thread, Jer wrote that his agent was working on a task in staging when it ran into credential problems. So it searched the local machine, found a Railway token, and used it to delete a volume. The volume that held production data. And the backups.
The system rule said never run destructive or irreversible git commands. But Railway CLI isn't Git. The rule was too narrow.
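To make the "too narrow" point concrete, here is a minimal sketch of a guard that judges a command by what it does rather than by which tool runs it. Everything in it is an assumption for illustration: the word list, the flag list, the `guard` function, and the example command strings are hypothetical and are not taken from Jer's setup or from any real CLI's documentation.

```python
import re
import shlex

# Illustrative lists (assumptions, not exhaustive): words and flags that usually
# signal a destructive or irreversible action, whatever tool receives them.
DESTRUCTIVE_WORDS = {
    "delete", "destroy", "drop", "rm", "remove",
    "detach", "purge", "wipe", "truncate", "reset",
}
DESTRUCTIVE_FLAGS = {"--force", "-f", "--hard"}


def is_destructive(command: str) -> bool:
    """Return True if any token in the command looks destructive."""
    try:
        tokens = shlex.split(command.lower())
    except ValueError:
        # Unparseable input (e.g. unbalanced quotes): fail closed.
        return True
    for token in tokens:
        if token in DESTRUCTIVE_FLAGS:
            return True
        # Split on non-letters so forms like "volume:delete" are caught too.
        if any(part in DESTRUCTIVE_WORDS for part in re.split(r"[^a-z]+", token)):
            return True
    return False


def guard(command: str, approved_by_human: bool = False) -> str:
    """Block destructive commands unless a human explicitly approved them."""
    if is_destructive(command) and not approved_by_human:
        return "blocked"
    return "allowed"


# Hypothetical command strings, used only to exercise the guard.
print(guard("git status"))
print(guard("git push --force"))
print(guard("some-cloud-cli volume delete my-volume"))
```

A check like this is deliberately blunt. It will block some harmless commands, and that is the trade-off you want between an agent and your infrastructure: a false alarm costs a minute, a missed deletion costs the database.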
Railway, if you don't know it, is a deployment platform competing with Heroku, Render, and Fly.io. Developers like it because it abstracts away infrastructure complexity.
Jer enumerated what failed. Railway didn't confirm the deletion. CLI tokens have blanket permissions. Backups were in the same place as production. After 30 hours, still no recovery info from Railway.
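Three of those failures (no confirmation, blanket tokens, backups next to production) can be expressed as checks in a wrapper you own. The sketch below is hypothetical throughout: `AgentToken`, `authorize`, `backups_are_isolated`, and the action names are invented for illustration, and nothing here describes how Railway tokens actually work.

```python
from dataclasses import dataclass

# Hypothetical action names that should always require explicit confirmation.
IRREVERSIBLE_ACTIONS = frozenset({"delete_volume", "delete_database"})


@dataclass(frozen=True)
class AgentToken:
    environment: str            # the single environment this token may touch
    allowed_actions: frozenset  # the only actions this token may perform


def authorize(token, environment, action, resource_name=None, typed_confirmation=None):
    """Allow an action only if it is in scope and, when irreversible, confirmed."""
    if environment != token.environment:
        return False  # a staging token must never reach production
    if action not in token.allowed_actions:
        return False  # no blanket permissions
    if action in IRREVERSIBLE_ACTIONS:
        # Require a human to type the exact resource name before deleting it.
        return typed_confirmation is not None and typed_confirmation == resource_name
    return True


def backups_are_isolated(production_location, backup_locations):
    """True only if at least one backup exists and none share production's location."""
    return bool(backup_locations) and all(
        location != production_location for location in backup_locations
    )


staging_token = AgentToken("staging", frozenset({"deploy", "read_logs"}))
print(authorize(staging_token, "staging", "deploy"))
print(authorize(staging_token, "production", "deploy"))
print(backups_are_isolated("volume-a", ["volume-a"]))
```

The design choice is that the agent never holds a credential able to do more than its task requires, so a token found lying around on disk is worth very little.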
A turn of events like this is devastating, especially when you have users, all their data is gone, and nobody can tell you whether it can be recovered.
There are lessons here. The full article covers 8 specific practices I've developed from scaling SaaS platforms, including how to structure environment separation, token scoping, middleware guards that sit between AI and your infrastructure, and backup strategies that actually protect you when the agent goes rogue.
