The $47,000 AI Agent Loop: What Happens When Agents Run Without Guardrails
A single LangChain agent in an infinite loop burned through $47,000 before anyone noticed. Here's what went wrong, why it keeps happening, and how to prevent it.
A $47,000 mistake nobody saw coming
In late 2024, a team deployed a LangChain agent to automate support ticket classification. Standard setup — GPT-4, a retrieval chain, a loop that retried on failures. The agent hit an edge case, got stuck, and started calling OpenAI in a tight loop.
By the time someone checked the dashboard the next morning, the bill was $47,000.
No alert fired. No circuit breaker tripped. No kill switch existed. The agent just kept running, burning tokens on the same failed classification, over and over.
This isn't rare. It's the default.
The $47K incident made the rounds on developer forums, but it's not an outlier. It's the predictable outcome of how agents are built today.
Most AI agent frameworks give you powerful abstractions — tool calling, chain-of-thought, multi-step reasoning, autonomous loops. What none of them ship out of the box is a way to say "stop spending money."
Consider this: 81% of AI agents in production have zero cost governance.
That number comes from a 2026 industry report on AI agent deployments. Four out of five production agents have no budget cap, no spend alert, and no emergency stop.
Why agents loop
Agent loops aren't bugs in the traditional sense. They're emergent behavior from systems designed to retry and self-correct.
Here's what typically goes wrong:
- **Retry logic without a ceiling.** The agent fails, retries, fails again, retries forever. Each retry costs tokens.
- **Tool call cycles.** The agent calls a tool, gets an unexpected response, and calls it again hoping for a different result.
- **Self-correction spirals.** The agent detects its own output is wrong, tries to fix it, makes it worse, tries again.
- **Context window stuffing.** Each iteration adds to the context, which increases the token count per call, which in turn accelerates the cost.
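The first failure mode, an unbounded retry loop, is also the easiest to cap. A minimal sketch, where `MAX_ATTEMPTS` and the failing `call_model` stub are illustrative stand-ins, not part of any framework:

```python
# Sketch: the unbounded-retry failure mode, fixed with a hard ceiling.
MAX_ATTEMPTS = 5  # illustrative value; pick one that fits your workload

def call_model(prompt: str) -> str:
    # Stand-in for the real LLM call; here it always fails,
    # simulating the stuck edge case from the incident.
    raise TimeoutError("upstream model error")

def classify_with_ceiling(prompt: str) -> str:
    last_error = None
    for _ in range(MAX_ATTEMPTS):  # bounded: can never retry forever
        try:
            return call_model(prompt)
        except Exception as err:
            last_error = err
    raise RuntimeError(f"gave up after {MAX_ATTEMPTS} attempts") from last_error
```

The point is not the retry count; it is that failure becomes loud (an exception someone sees) instead of silent (a loop nobody sees).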
A single GPT-4 call costs a few cents. A thousand of them in ten minutes costs hundreds of dollars. Ten thousand overnight costs tens of thousands.
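The compounding from context stuffing is easy to underestimate. A quick sketch of the arithmetic, using an assumed price of $0.03 per 1K input tokens (actual GPT-4 pricing varies by model and date):

```python
# Illustrative only: the price and token counts below are assumptions.
PRICE_PER_1K_INPUT = 0.03  # assumed GPT-4-class input price, USD

def loop_cost(base_tokens: int, added_per_iter: int, iterations: int) -> float:
    """Total input cost when every retry re-sends the ever-growing context."""
    total_tokens = sum(base_tokens + i * added_per_iter for i in range(iterations))
    return total_tokens / 1000 * PRICE_PER_1K_INPUT

# A 2K-token prompt that grows by 500 tokens per failed attempt:
flat = loop_cost(2_000, 0, 1_000)       # about $60 if the context never grew
stuffed = loop_cost(2_000, 500, 1_000)  # about $7,552: linear growth, quadratic cost
```

The same thousand calls cost two orders of magnitude more once each iteration drags the previous ones along in the prompt.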
The framework gap
Let's be direct about what the major frameworks provide:
| Framework | Budget limits | Kill switch | Spend alerts |
|---|---|---|---|
| LangChain / LangGraph | No | No | No |
| CrewAI | No | No | No |
| OpenAI Agents SDK | No | No | No |
| Vercel AI SDK | No | No | No |
| AutoGen | No | No | No |
Every framework on this list gives you the ability to build agents that spend unlimited money. None of them give you the ability to stop them.
This isn't a criticism of these frameworks — they solve different problems. But it means the responsibility falls entirely on you, the developer, to build cost controls from scratch.
Most teams don't. Not because they don't care, but because the first version of any agent is focused on making it work, not making it safe. Guardrails get added after the first incident.
What $47,000 buys you
To put the number in perspective:
- $47,000 is roughly 470 million GPT-4 input tokens, or 23 million output tokens
- It's a senior engineer's salary for three months
- It's enough to run a well-governed fleet of agents for an entire year
All of it gone in a single overnight loop. No value generated. No tickets classified. Just tokens burned.
How to prevent it
The fix isn't complicated. The fix is adding a cost ceiling that your agent cannot exceed, enforced at the level where LLM calls actually happen.
Here's what that looks like in practice:
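The original snippet isn't reproduced here; as a hedged sketch of the idea, using a hypothetical `BudgetGuard` helper (nothing by this name ships in the frameworks above, and the dollar figures are the caps described below):

```python
# Sketch only: BudgetGuard is a hypothetical wrapper, not a framework feature.
import time

class BudgetExceeded(RuntimeError):
    pass

class BudgetGuard:
    """Tracks estimated spend and halts execution the moment a cap is hit."""

    def __init__(self, per_run_usd: float = 5.0, per_day_usd: float = 50.0):
        self.per_run_usd = per_run_usd
        self.per_day_usd = per_day_usd
        self.run_spend = 0.0
        self.day_spend = 0.0
        self.day_start = time.time()

    def charge(self, cost_usd: float) -> None:
        """Call before each LLM request with that request's estimated cost."""
        if time.time() - self.day_start >= 86_400:  # roll the daily window
            self.day_spend, self.day_start = 0.0, time.time()
        self.run_spend += cost_usd
        self.day_spend += cost_usd
        if self.run_spend > self.per_run_usd:
            raise BudgetExceeded(f"run cap ${self.per_run_usd} exceeded")
        if self.day_spend > self.per_day_usd:
            raise BudgetExceeded(f"daily cap ${self.per_day_usd} exceeded")

guard = BudgetGuard(per_run_usd=5.0, per_day_usd=50.0)
# Inside the agent loop, before every model call:
# guard.charge(estimated_cost_of_next_call)
```

Whatever enforcement layer you choose, the property that matters is that the check runs before each call, so a runaway loop dies on the first request past the cap rather than after a dashboard refresh.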
With a cap like that in place, the agent has a hard limit: $5 per run, $50 per day. If either limit is hit, execution stops immediately. Not after the next polling interval. Not when someone checks a dashboard. Immediately, before the next token.
For teams that don't want to touch their agent code at all, a proxy approach works too:
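No specific proxy is named in the piece; as one sketch, the official OpenAI SDKs honor the `OPENAI_BASE_URL` environment variable, so an assumed budget-enforcing gateway on localhost (the port and entry-point script below are illustrative) can be dropped in front of the API:

```shell
# Assumed setup: a budget-enforcing proxy already listening on localhost:8080.
# The OpenAI SDK reads OPENAI_BASE_URL, so the agent code stays untouched.
export OPENAI_BASE_URL="http://localhost:8080/v1"
python run_agent.py  # hypothetical entry point; calls now route through the proxy
```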
Zero code changes. Every OpenAI call now goes through a local proxy that enforces the budget.
The real lesson
The team that lost $47,000 didn't have bad engineers. They had a standard production deployment with no cost controls — which, today, is the norm.
The question isn't whether your agent will hit an edge case. It's whether you'll have a kill switch when it does.
