Security

Apr 8, 2026

The $47,000 AI Agent Loop: What Happens When Agents Run Without Guardrails

A single LangChain agent in an infinite loop burned through $47,000 before anyone noticed. Here's what went wrong, why it keeps happening, and how to prevent it.

Alexander Lundberg

Lead Product

A $47,000 mistake nobody saw coming

In late 2024, a team deployed a LangChain agent to automate support ticket classification. Standard setup — GPT-4, a retrieval chain, a loop that retried on failures. The agent hit an edge case, got stuck, and started calling OpenAI in a tight loop.

By the time someone checked the dashboard the next morning, the bill was $47,000.

No alert fired. No circuit breaker tripped. No kill switch existed. The agent just kept running, burning tokens on the same failed classification, over and over.

This isn't rare. It's the default.

The $47K incident made the rounds on developer forums, but it's not an outlier. It's the predictable outcome of how agents are built today.

Most AI agent frameworks give you powerful abstractions — tool calling, chain-of-thought, multi-step reasoning, autonomous loops. What none of them ship out of the box is a way to say "stop spending money."

Consider this: 81% of AI agents in production have zero cost governance.

That number comes from a 2026 industry report on AI agent deployments. Four out of five production agents have no budget cap, no spend alert, and no emergency stop.

Why agents loop

Agent loops aren't bugs in the traditional sense. They're emergent behavior from systems designed to retry and self-correct.

Here's what typically goes wrong:

  • Retry logic without a ceiling. The agent fails, retries, fails again, retries forever. Each retry costs tokens.

  • Tool call cycles. The agent calls a tool, gets an unexpected response, calls it again hoping for a different result.

  • Self-correction spirals. The agent detects its own output is wrong, tries to fix it, makes it worse, tries again.

  • Context window stuffing. Each iteration adds to the context, which increases token count per call, which accelerates the cost.
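The first failure mode above — retry without a ceiling — is also the cheapest to fix. Here is a minimal sketch; `call_llm` and the classifier function are hypothetical stand-ins, not any framework's API. The point is that the loop carries both a retry count and a token budget, and stops itself on either:

```python
# Sketch: a retry loop with a ceiling. call_llm is a hypothetical
# function returning (result_or_None, tokens_consumed).

class RetryBudgetExceeded(Exception):
    pass

def classify_with_ceiling(ticket, call_llm, max_retries=3, max_tokens=10_000):
    """Retry on failure, but stop at a retry count AND a token ceiling."""
    spent = 0
    for attempt in range(max_retries):
        result, tokens = call_llm(ticket)
        spent += tokens
        if spent > max_tokens:
            raise RetryBudgetExceeded(f"{spent} tokens after {attempt + 1} calls")
        if result is not None:  # success: stop retrying
            return result
    raise RetryBudgetExceeded(f"no result after {max_retries} calls ({spent} tokens)")
```

An unbounded version of this loop — `while True: retry` — is exactly what ran overnight in the incident above.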

A single GPT-4 call costs a few cents. A thousand of them in ten minutes costs hundreds of dollars. Ten thousand overnight costs tens of thousands.
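The context-stuffing effect compounds this: if each iteration re-sends a context that grew by the last iteration's output, per-call tokens rise linearly and cumulative cost rises quadratically. A quick back-of-the-envelope sketch, using an illustrative price (not any provider's current rate):

```python
# Sketch: why context stuffing accelerates spend.
# $30 per million input tokens is an illustrative GPT-4-era rate.

PRICE_PER_TOKEN = 30 / 1_000_000

def loop_cost(base_tokens, added_per_iter, iterations):
    """Total input cost when each iteration re-sends a growing context."""
    total_tokens = sum(base_tokens + i * added_per_iter for i in range(iterations))
    return total_tokens * PRICE_PER_TOKEN

# A 2,000-token prompt that grows by 500 tokens per iteration:
# doubling the iteration count more than doubles the bill.
```

With these assumed numbers, 100 iterations cost about $80; 1,000 iterations cost over $7,500 — the overnight-loop scale described above.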

The framework gap

Let's be direct about what the major frameworks provide:

Framework             | Budget limits | Kill switch | Spend alerts
--------------------- | ------------- | ----------- | ------------
LangChain / LangGraph | No            | No          | No
CrewAI                | No            | No          | No
OpenAI Agents SDK     | No            | No          | No
Vercel AI SDK         | No            | No          | No
AutoGen               | No            | No          | No

Every framework on this list gives you the ability to build agents that spend unlimited money. None of them give you the ability to stop them.

This isn't a criticism of these frameworks — they solve different problems. But it means the responsibility falls entirely on you, the developer, to build cost controls from scratch.

Most teams don't. Not because they don't care, but because the first version of any agent is focused on making it work, not making it safe. Guardrails get added after the first incident.
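For scale: a hand-rolled ceiling is only a few dozen lines, which makes its near-universal absence the more striking. A minimal sketch — every name here (`SpendGuard`, the price constant) is invented for illustration, not a real library API:

```python
# Sketch: a hand-rolled cost ceiling around any LLM client.
# Prices and class names are illustrative assumptions.

class BudgetExceeded(Exception):
    pass

class SpendGuard:
    """Tracks cumulative spend and refuses to continue past a hard cap."""

    def __init__(self, cap_usd, price_per_1k_tokens=0.03):
        self.cap_usd = cap_usd
        self.price = price_per_1k_tokens
        self.spent_usd = 0.0

    def charge(self, tokens_used):
        """Record a call's token usage; raise once the cap is crossed."""
        self.spent_usd += tokens_used / 1000 * self.price
        if self.spent_usd > self.cap_usd:
            raise BudgetExceeded(
                f"${self.spent_usd:.2f} spent, cap is ${self.cap_usd:.2f}"
            )
```

Usage is one line after each LLM call — charge the guard with the response's reported token usage — so enforcement lives at the call site, not in a dashboard someone checks the next morning.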

What $47,000 buys you

To put the number in perspective:

  • $47,000 is roughly 1.5 billion GPT-4 input tokens, or about 780 million output tokens, at the original $30/$60-per-million-token pricing

  • It's a senior engineer's salary for three months

  • It's enough to run a well-governed fleet of agents for an entire year

All of it gone in a single overnight loop. No value generated. No tickets classified. Just tokens burned.

How to prevent it

The fix isn't complicated. The fix is adding a cost ceiling that your agent cannot exceed, enforced at the level where LLM calls actually happen.

Here's what that looks like in practice:

import wickd

@wickd.agent(budget=wickd.Budget(per_run=5.00, daily=50.00))
def classify_tickets(batch):
    # Your existing agent code, unchanged
    ...

Three lines. The agent now has a hard cap — $5 per run, $50 per day. If either limit is hit, execution stops immediately. Not after the next polling interval. Not when someone checks a dashboard. Immediately, before the next token.

[wickd] classify_tickets | BUDGET EXCEEDED | $5.01/$5.00 | 847 calls | killed

For teams that don't want to touch their agent code at all, a proxy approach works too:

wickd start --budget-per-run 5.00 --budget-daily 50.00
export OPENAI_BASE_URL

Zero code changes. Every OpenAI call now goes through a local proxy that enforces the budget.

The real lesson

The team that lost $47,000 didn't have bad engineers. They had a standard production deployment with no cost controls — which, today, is the norm.

The question isn't whether your agent will hit an edge case. It's whether you'll have a kill switch when it does.