Skip to content
cost 3 min
cost gotcha cli

152x Over Budget: The Claude CLI Trap Nobody Warns You About

The --max-budget-usd flag isn't a hard cap. It's checked between turns, not mid-generation — and a single turn can overshoot by 152x. Here's how it works and how to protect yourself.

Tuna Ozmen · · 3 min read

The Test

I ran this command expecting Claude to stop after spending a tenth of a cent:

Terminal window
claude -p "Write a 5000 word essay about the history of computing" \
--output-format json \
--max-budget-usd 0.001

Two minutes later, the response came back. Claude had written the entire essay. The bill: $0.152. That is 152 times the budget I set.

Why This Happens

The --max-budget-usd flag is not a hard cap. It is a between-turn check. Here is the actual enforcement sequence:

  1. Claude starts generating a response
  2. The full response completes (one turn)
  3. Then the budget is checked
  4. If exceeded, the next turn does not start

Step 2 is the problem. A single turn runs to completion before any budget check fires. If that turn generates 5,446 output tokens over 126 seconds, the budget flag can only watch.

Here is the actual JSON response from that test:

{
"type": "result",
"subtype": "error_max_budget_usd",
"is_error": false,
"total_cost_usd": 0.1521415,
"modelUsage": {
"claude-opus-4-6": {
"outputTokens": 5446,
"cacheReadInputTokens": 14253,
"costUSD": 0.1521415
}
}
}

Notice is_error is false. This is not a crash. It is a controlled stop that happened after the money was already spent.

The Minimum Floor Makes It Worse

Every Claude CLI call has a minimum cost from the system prompt alone. For Opus, that floor is approximately $0.016 per call — about 14,253 cached tokens that must be read before Claude even sees your prompt.

This means setting --max-budget-usd 0.01 will always trigger a budget error with Opus. The system prompt tax exceeds your budget before a single output token is generated.

At micro-budgets, the overshoot ratios are dramatic:

BudgetActual CostOvershoot
$0.001$0.04848x (short prompt)
$0.001$0.152152x (long prompt)
$0.01$0.0171.7x
$0.10$0.0160.16x (under budget)

The pattern is clear: below $0.02 with Opus, the budget flag is just a “run once and stop” mechanism.

The Fix

Set realistic minimums. For Opus single-turn tasks, use at least $0.10. For multi-turn sessions, use $5.00 or more.

Detect overshoot in automation. Check the exit code and subtype field:

Terminal window
RESULT=$(claude -p "Your prompt" --output-format json --max-budget-usd 1.00)
SUBTYPE=$(echo "$RESULT" | jq -r '.subtype')
COST=$(echo "$RESULT" | jq '.total_cost_usd')
if [ "$SUBTYPE" = "error_max_budget_usd" ]; then
echo "Budget exceeded: spent $COST"
fi

Track cumulative costs externally. The budget flag tracks per-session spend, but for CI/CD pipelines running hundreds of calls, you need a wrapper that accumulates total_cost_usd across invocations and kills the pipeline when a threshold is hit.

Use Sonnet for cheap tasks. Sonnet’s minimum per-call cost is $0.005 — roughly 3x cheaper than Opus. For formatting, summarization, and simple questions, --model claude-sonnet-4-6 is the right call.

The Mental Model

Think of --max-budget-usd as a governor on a car, not a brick wall. It limits how many laps you take, but it cannot stop the car mid-lap. Each lap (turn) runs to completion, and the governor only checks between laps.

For hard real-time budget enforcement, you need external tooling. The CLI’s budget flag is a useful guardrail, not a billing guarantee.

Master Claude CLI cost controls

Read the full guide
Found this useful? Share it with your team.
Share