
Apr 4, 2026


How to Add Budget Limits to Any AI Agent in 5 Minutes

Your AI agent has no spending cap. Here's how to add budget limits, kill switches, and cost tracking to any Python or TypeScript agent in under 5 minutes.


James Okoye

Lead Engineer


The problem in 30 seconds

You built an agent. It calls GPT-4, Claude, or Gemini. It works. You deploy it. And now it can spend unlimited money with zero oversight.

There's no built-in way to say "stop at $5" in OpenAI's SDK. Or Anthropic's. Or Google's. And no agent framework — LangChain, CrewAI, Vercel AI SDK — adds one for you.

This tutorial fixes that. By the end, your agent will have hard cost ceilings enforced on every single LLM call.

What you'll get

  • A per-run budget cap (e.g. kill this run if it exceeds $5)

  • A daily budget cap (e.g. kill all runs if today's total exceeds $50)

  • Real-time cost tracking across providers

  • A full trace of every LLM call with cost attribution

  • Zero changes to your existing LLM code
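Cost tracking of this kind boils down to a price table plus the token counts every SDK response already carries. Here is a minimal sketch of the idea; the prices are illustrative placeholders and `call_cost` is a hypothetical helper, not wickd's API:

```python
# Per-call cost from token usage and a per-million-token price table.
# Prices are illustrative placeholders, not live rates.
PRICES_PER_1M = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one LLM call, given its token counts."""
    p = PRICES_PER_1M[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

Summing this across every intercepted call gives the running total that the budget check compares against the cap.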

Python

Step 1: Install

pip install wickd-ai

Step 2: Wrap your agent

Before:

import openai

def my_agent(task: str):
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": task}]
    )
    return response.choices[0].message.content

After:

import openai
import wickd

@wickd.agent(budget=wickd.Budget(per_run=5.00, daily=50.00))
def my_agent(task: str):
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": task}]
    )
    return response.choices[0].message.content

my_agent.run("Summarise yesterday's tickets")

That's it. One import, one decorator, call .run() instead of calling the function directly.

Step 3: See it work

[wickd] my_agent | $0.0043 cost | 1 calls | budget: $0.0043/$5.00 | 1203ms | trace: a1b2c3d4

Every run prints a summary. If the budget is exceeded mid-run, execution stops immediately:

[wickd] my_agent | BUDGET EXCEEDED | $5.01/$5.00 | 847 calls | killed
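The kill behaviour is conceptually simple: every intercepted call adds its cost to a running total, and the first call that pushes the total past the cap aborts the run. A stripped-down sketch of that pattern, not wickd's internals (`with_budget` and `BudgetExceeded` are names invented for illustration):

```python
import functools

class BudgetExceeded(RuntimeError):
    pass

def with_budget(per_run: float):
    """Give the wrapped function a charge() hook that aborts past the cap."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            spent = 0.0
            def charge(cost: float):
                nonlocal spent
                spent += cost
                if spent > per_run:
                    raise BudgetExceeded(f"${spent:.2f} > ${per_run:.2f}")
            return fn(*args, charge=charge, **kwargs)
        return wrapper
    return decorate

@with_budget(per_run=5.00)
def agent(task: str, charge):
    for _ in range(10):
        charge(0.75)  # pretend each LLM call cost $0.75
    return "done"
```

Calling `agent(...)` here raises `BudgetExceeded` on the seventh simulated call, once the total hits $5.25. In the real library the charge happens inside the patched SDK call, so your agent code never sees the hook.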

TypeScript

Step 1: Install

npm install wickd

Step 2: Wrap your agent

Before:

import OpenAI from "openai";

async function myAgent(task: string) {
  const client = new OpenAI();
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: task }],
  });
  return response.choices[0].message.content;
}

After:

import OpenAI from "openai";
import { agent, Budget } from "wickd";

const myAgent = agent({
  fn: async (task: string) => {
    const client = new OpenAI();
    const response = await client.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: task }],
    });
    return response.choices[0].message.content;
  },
  budget: new Budget({ perRun: 5.0, daily: 50.0 }),
});

await myAgent.run("Summarise yesterday's tickets");

Same idea. Your LLM code stays untouched. Wickd patches the SDK under the hood and tracks every call.
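"Patching the SDK" just means swapping a client method for a wrapper that records usage and then delegates to the original. A toy sketch of the mechanism, with a stand-in client (`DummyClient` and the `calls` list are invented for illustration):

```python
class DummyClient:
    """Stand-in for a real SDK client such as openai.OpenAI()."""
    def create(self, **kwargs):
        return {"usage": {"total_tokens": 42}, "text": "hi"}

calls = []  # usage recorded by the patch

def patch(client):
    original = client.create
    def traced(**kwargs):
        response = original(**kwargs)  # delegate to the real method
        calls.append(response["usage"]["total_tokens"])  # record usage
        return response
    client.create = traced  # shadow the method on this instance

client = DummyClient()
patch(client)
client.create(model="gpt-4o")
```

Because the patch sits at the transport layer, any framework built on the SDK inherits the tracking for free.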

Zero-code alternative: the proxy

If you don't want to touch your agent code at all, Wickd ships a local proxy:

pip install wickd-ai
wickd start --budget-per-run 5.00 --budget-daily 50

Then point your SDK at it:

export OPENAI_BASE_URL=http://localhost:4319/openai/v1
export ANTHROPIC_BASE_URL

Every LLM call now runs through the proxy. Budgets enforced. Traces collected. No code changes.

Adding notifications

You probably want to know when a budget is hit. Wickd supports console, Slack, and webhook handlers:

@wickd.agent(
    budget=wickd.Budget(per_run=5.00, daily=50.00),
    on_budget_kill=wickd.notify.slack("https://hooks.slack.com/..."),
)
def my_agent(task: str):
    ...
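A kill handler is ultimately just a callable that fires when the cap trips. The sketch below shows the shape such a hook takes; the `enforce` helper and the handler signature are assumptions for illustration, and a real handler would POST to Slack or a webhook instead of appending to a list:

```python
events = []  # a real handler would POST to Slack or a webhook URL

def on_kill(agent_name: str, spent: float, cap: float):
    events.append(f"{agent_name} killed at ${spent:.2f}/${cap:.2f}")

def enforce(agent_name: str, spent: float, cap: float, on_budget_kill) -> bool:
    """Return True if the run may continue; fire the hook and stop otherwise."""
    if spent > cap:
        on_budget_kill(agent_name, spent, cap)
        return False
    return True

allowed = enforce("my_agent", 5.01, 5.00, on_kill)
```

Keeping the handler a plain callable is what lets console, Slack, and webhook targets share one interface.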

Adding approval gates

For agents that do risky things — database writes, sending emails, deploying code — you can require human approval:

@wickd.approval("database_write")
def update_user_record(user_id, data):
    db.update(user_id, data)

When the agent hits this function, it pauses and asks for approval. Approve or deny. Full audit trail.
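The gate pattern itself fits in a few lines: a decorator that consults an approver before letting the call through, and logs every request either way. In this sketch the approver is a plain function standing in for a human; the names are invented for illustration, not wickd's API:

```python
import functools

def approval(action: str, approver):
    """Run the wrapped function only if the approver allows the action."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if not approver(action):
                raise PermissionError(f"{action} denied")
            return fn(*args, **kwargs)  # approved: proceed
        return wrapper
    return decorate

audit = []  # every request is logged, approved or denied

def reviewer(action: str) -> bool:
    audit.append(action)
    return action != "database_write"  # deny writes in this demo

@approval("send_email", reviewer)
def send_email() -> str:
    return "sent"
```

The real gate blocks until a human answers rather than calling a function, but the control flow is the same: the risky operation simply never executes without a yes.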

Supported providers

Wickd intercepts calls from OpenAI, Anthropic, and Google GenAI SDKs automatically. Since it patches at the SDK transport layer, anything built on top of these works too:

  • Vercel AI SDK

  • LangChain / LangGraph

  • CrewAI

  • OpenAI Agents SDK

  • Any OpenAI-compatible endpoint

Viewing traces

Every run is traced. Check them from the CLI:

wickd traces            # list recent runs
wickd traces --cost     # sort by cost
wickd traces a1b2c3d4   # detail view for a specific run

Five minutes, done

That's the whole integration. One package install, one decorator or wrapper, and your agent has hard cost ceilings that actually hold. No infrastructure to deploy, no dashboard to set up, no external service to depend on. Everything runs locally, in-process, at zero added latency.