Industry
AI Agent Frameworks Don't Ship Guardrails. Here's Why That's a Problem.
LangChain, CrewAI, OpenAI Agents SDK, Vercel AI SDK — none of them ship budget limits or kill switches. We looked at why, and what it means for production agents.
The frameworks are incredible. The gap is obvious.
The current generation of AI agent frameworks is genuinely impressive. LangChain gives you composable chains and tool calling. CrewAI lets you orchestrate multi-agent teams. OpenAI's Agents SDK provides a clean abstraction for stateful agents. Vercel AI SDK makes streaming and tool use feel native in TypeScript.
What none of them do is help you control what your agent spends.
This isn't an edge case. It's the single most common production failure mode for autonomous agents — and the frameworks have decided it's not their problem.
What the frameworks ship
Let's look at what each framework gives you today:
| Capability | LangChain | CrewAI | OpenAI Agents | Vercel AI |
|---|---|---|---|---|
| Tool calling | Yes | Yes | Yes | Yes |
| Multi-step reasoning | Yes | Yes | Yes | Yes |
| Streaming | Yes | Yes | Yes | Yes |
| Memory / state | Yes | Yes | Yes | Partial |
| Retries | Yes | Yes | Yes | Yes |
| Budget limits | No | No | No | No |
| Kill switches | No | No | No | No |
| Spend alerts | No | No | No | No |
| Cost tracking | No | No | No | No |
| Approval gates | No | No | No | No |
Every framework gives you the tools to build an agent that can spend unlimited money autonomously. None of them give you a way to stop it.
Why frameworks don't build this
There are legitimate reasons this gap exists, and it's worth understanding them before getting frustrated.
1. Frameworks are provider-agnostic by design.
Budget enforcement requires knowing the cost per token for every model across every provider. That's a moving target — pricing changes, new models launch, and different providers use different billing units. Frameworks don't want to maintain a cost database that's always one update behind.
2. Retry logic is a feature, not a bug.
Frameworks are designed to be resilient. If a call fails, retry. If the output is wrong, self-correct. This makes agents robust — but it also makes them expensive when things go sideways. The same retry logic that recovers from transient errors is what turns a stuck agent into a $47,000 infinite loop.
3. Budget limits are runtime enforcement, not orchestration.
Frameworks think in terms of chains, graphs, and tool calls. Budget limits are a different concern — they need to intercept the actual HTTP call to the LLM provider, count tokens, look up pricing, and enforce a cap. That's a lower layer than most frameworks operate at.
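To make that layer concrete, here is a minimal sketch of what call-level enforcement involves. Every name here (`BudgetedClient`, `PRICE_PER_1K`, `fake_llm`) and every price is illustrative, not any framework's or provider's real API:

```python
# Minimal sketch of call-level budget enforcement. All names and prices
# are illustrative; no real framework or provider API is used here.

PRICE_PER_1K = {  # hypothetical USD cost per 1K tokens
    ("openai", "gpt-4o"): {"in": 0.0025, "out": 0.01},
}

class BudgetExceeded(RuntimeError):
    pass

class BudgetedClient:
    """Wraps a raw LLM call: counts tokens, looks up pricing, enforces a cap."""

    def __init__(self, call_fn, budget_usd):
        self.call_fn = call_fn        # the raw provider call
        self.budget_usd = budget_usd  # hard ceiling for this run
        self.spent_usd = 0.0

    def call(self, provider, model, prompt):
        if self.spent_usd >= self.budget_usd:
            raise BudgetExceeded(f"spent ${self.spent_usd:.4f} of ${self.budget_usd:.2f}")
        text, tokens_in, tokens_out = self.call_fn(provider, model, prompt)
        price = PRICE_PER_1K[(provider, model)]
        self.spent_usd += (tokens_in / 1000) * price["in"] + (tokens_out / 1000) * price["out"]
        return text

# Stand-in for a real provider call: returns text plus token counts.
def fake_llm(provider, model, prompt):
    return ("ok", 800, 200)

client = BudgetedClient(fake_llm, budget_usd=0.01)
calls = 0
try:
    while True:  # an agent loop with no exit condition of its own
        client.call("openai", "gpt-4o", "revise this draft")
        calls += 1
except BudgetExceeded:
    pass  # the cap, not the loop, decided when to stop
```

The point of the sketch is the shape of the problem: enforcement has to sit between the agent and the wire, and it needs a pricing table that someone has to keep current.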
4. Nobody wants to be the "slow" framework.
Adding budget checks to every LLM call adds code path complexity. Framework maintainers worry about latency, about breaking changes, about the support burden. Easier to leave it out and let users handle it.
These are all understandable reasons. But the result is the same: developers deploy agents to production with zero cost controls, and the framework says that's fine.
What this looks like in production
Consider a standard LangGraph agent deployed to production: it researches, writes, reviews, and revises. The `should_revise` function can send it back through the loop any number of times, and each pass calls the LLM again. There is no limit on iterations, no budget cap, no way to say "stop after $10."
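A simplified pure-Python stand-in for that loop (illustrative, not real LangGraph code; `should_revise` is the only name taken from the agent above):

```python
# Pure-Python stand-in for the agent's revision loop (not real LangGraph
# code). Each node function stands in for one LLM call.

def research(state):
    state["llm_calls"] += 1
    return state

def write(state):
    state["llm_calls"] += 1
    return state

def review(state):
    state["llm_calls"] += 1
    state["revisions"] += 1
    return state

def should_revise(state):
    # In the real agent this inspects model output. Nothing bounds it:
    # as long as the reviewer keeps objecting, the agent loops and spends.
    return state["revisions"] < state["reviewer_objections"]

state = {"llm_calls": 0, "revisions": 0, "reviewer_objections": 7}
state = review(write(research(state)))
while should_revise(state):        # no max iterations, no budget cap
    state = review(write(state))   # revise, then review again
```

Seven reviewer objections turn three LLM calls into fifteen, and the multiplier is invisible until the invoice arrives.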
Now imagine this running as a batch job overnight on 500 tasks.
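With some illustrative numbers (all assumed, none measured), the overnight batch math looks like this:

```python
# Back-of-the-envelope cost for the overnight batch. The per-call cost
# and pass counts are assumptions for illustration, not measurements.
tasks = 500
calls_per_pass = 4        # research, write, review, revise
cost_per_call = 0.12      # USD, assumed average per LLM call

happy_path = tasks * calls_per_pass * 1 * cost_per_call   # one pass per task
runaway = tasks * calls_per_pass * 5 * cost_per_call      # five passes per task
```

Five passes is not exotic for a self-correcting loop. The difference between roughly $240 and $1,200 in one night is a single conditional edge nobody capped.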
The 81% problem
A 2026 report on AI agent deployments found that 81% of production agents have zero cost governance. Only 14.4% are fully governed with budget limits, monitoring, and access controls.
That's not a developer skills problem. It's an ecosystem problem. When the frameworks don't provide guardrails, and the LLM providers don't provide guardrails, and the agent runs autonomously — who's enforcing the budget?
Nobody. Until the bill arrives.
What governance actually requires
Proper agent governance needs five things:
1. Budget enforcement — hard cost ceilings that stop the agent mid-execution, not after the run completes.
2. Kill switches — the ability to halt an agent immediately, not gracefully, not after the current step, now.
3. Cost attribution — knowing which agent, which run, which specific LLM call cost what.
4. Approval gates — requiring human sign-off before the agent takes high-risk actions like database writes, API calls to external services, or financial transactions.
5. Traces — a full record of every decision the agent made, every tool it called, and every dollar it spent.
None of the major frameworks provide any of these five. Not one.
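As a sketch of how the five requirements fit together in process, every name, structure, and number below is illustrative rather than any real library:

```python
# One-file sketch of the five governance requirements, in process.
# All names and numbers are illustrative, not a real library's API.
import threading
import time

class Killed(Exception): pass
class OverBudget(Exception): pass

class Governor:
    def __init__(self, budget_usd, approver):
        self.budget_usd = budget_usd      # 1. budget enforcement
        self.spent_usd = 0.0
        self.approver = approver          # 4. human approval callback
        self.killed = threading.Event()   # 2. kill switch
        self.trace = []                   # 3 + 5. attribution and traces

    def kill(self):
        self.killed.set()                 # halt before the next call

    def charge(self, agent, run_id, cost_usd):
        if self.killed.is_set():
            raise Killed(f"{agent}/{run_id} halted")
        if self.spent_usd + cost_usd > self.budget_usd:
            raise OverBudget(f"{agent}/{run_id}")
        self.spent_usd += cost_usd
        self.trace.append((time.time(), agent, run_id, "llm_call", cost_usd))

    def act(self, agent, run_id, action):
        if action["risk"] == "high" and not self.approver(action):
            self.trace.append((time.time(), agent, run_id, "blocked:" + action["name"], 0.0))
            return False
        self.trace.append((time.time(), agent, run_id, action["name"], 0.0))
        return True

gov = Governor(budget_usd=10.0, approver=lambda action: False)
gov.charge("writer", "run-1", 0.05)                               # attributed spend
allowed = gov.act("writer", "run-1", {"name": "db_write", "risk": "high"})

gov.kill()
halted = False
try:
    gov.charge("writer", "run-1", 0.05)
except Killed:
    halted = True
```

None of this is conceptually hard, which is part of the frustration: the hard part is wiring it into every LLM call, across every provider, without rewriting the agent.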
Where this enforcement should live
There are three possible layers for agent governance:
Network layer (API gateways like Portkey, Helicone) — these see HTTP requests to LLM providers. They can rate-limit, cache, and log. But they can't see agent context. They don't know that the same "agent" made three calls to OpenAI and two to Anthropic as part of one task. They can't enforce a cross-provider budget for a single run.
Framework layer (LangChain, CrewAI) — these see the orchestration logic. They could theoretically add budget tracking, but they'd need to maintain cost-per-token tables for every model and enforce limits at the level of the individual LLM call, which sits below their abstraction.
Runtime layer (inside the agent process) — this is where you have full context. You know which agent is running, what its budget is, how much it's spent across all providers, and you can kill it before the next token. This is the layer that's missing.
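For instance, a per-run ledger that sums spend across providers, something a network gateway cannot reconstruct from provider-by-provider traffic, takes only a few lines in process (names and the cap are illustrative):

```python
# Sketch: per-run spend summed across providers. A gateway sees each
# provider's traffic separately and cannot build this ledger; the
# runtime, with full agent context, can.
from collections import defaultdict

run_spend = defaultdict(float)   # run_id -> total USD across all providers

def record(run_id, provider, cost_usd, cap_usd=5.0):
    """Record one call's cost; return False when the run must be killed."""
    run_spend[run_id] += cost_usd
    return run_spend[run_id] <= cap_usd

ok_first = record("task-42", "openai", 3.0)
ok_second = record("task-42", "anthropic", 2.5)  # same run, different provider
```

Each provider would happily bill both calls. Only the runtime knows they belong to one $5 task.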
The uncomfortable conclusion
AI agent frameworks are not going to solve this for you. Not because they can't, but because it's outside their scope. They build orchestration. Governance is a different layer.
That layer needs to exist. It needs to intercept every LLM call regardless of provider. It needs to track cost in real time. It needs to enforce hard limits. And it needs to do it without requiring you to rewrite your agent.
That's why we built Wickd. But regardless of what tool you use — build something. Your agents are spending money autonomously. Somebody should be watching.
