Cost & Performance Intermediate

Cache Economics Explained

Understand the prompt caching tiers and how cache reads save 90% on repeated input content

Command

# First call: pays cache write cost
$ claude -p "Hello" --output-format json

# Subsequent calls within 5 min: 90% savings on cached content
$ claude -p "Hello again" --output-format json
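To confirm that a second call actually read from the cache, the token counts in the JSON output can be inspected. The following is a minimal sketch, not a definitive parser: it assumes the JSON result carries a `usage` object with `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens` fields (the names used by the Anthropic API usage object), and the `sample` payload is a hypothetical stand-in rather than real CLI output. Verify the field names against what your CLI version prints.

```python
import json


def cache_usage(raw: str) -> dict:
    """Summarize cache-related token counts from one JSON result string.

    Assumes a top-level "usage" object; missing fields count as zero.
    """
    usage = json.loads(raw).get("usage", {})
    written = usage.get("cache_creation_input_tokens", 0)
    read = usage.get("cache_read_input_tokens", 0)
    uncached = usage.get("input_tokens", 0)
    total = written + read + uncached
    return {
        "written": written,
        "read": read,
        "uncached": uncached,
        # Share of all input tokens that were served from the cache.
        "hit_ratio": read / total if total else 0.0,
    }


# Hypothetical payload for illustration only, not captured CLI output.
sample = (
    '{"usage": {"input_tokens": 10, '
    '"cache_creation_input_tokens": 0, '
    '"cache_read_input_tokens": 14990}}'
)
stats = cache_usage(sample)
print(stats)
```

A high `hit_ratio` on the second call suggests the shared prefix was read from cache; a nonzero `written` count suggests the call paid the write cost instead.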

Response

| Cache Tier       | Duration | Cost (× base input price) | Savings vs. uncached input |
|------------------|----------|---------------------------|----------------------------|
| Ephemeral write  | 5 min    | 1.25x                     | n/a                        |
| Long-lived write | 1 hour   | 2.0x                      | n/a                        |
| Cache read       | n/a      | 0.1x                      | 90%                        |
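The multipliers in the table can be turned into a small cost helper. This sketch assumes only the three multipliers listed above; the names `MULTIPLIER`, `cached_cost`, and `reads_to_break_even` are hypothetical, and the base price is passed in rather than hard-coded. The break-even helper answers how many cache reads are needed before the write surcharge is paid back.

```python
import math

# Multipliers relative to the base input price, taken from the table.
MULTIPLIER = {"write_5m": 1.25, "write_1h": 2.0, "read": 0.1}


def cached_cost(tokens: int, base_price_per_mtok: float, kind: str) -> float:
    """Dollar cost of processing `tokens` under the given cache operation."""
    return tokens / 1_000_000 * base_price_per_mtok * MULTIPLIER[kind]


def reads_to_break_even(write_kind: str) -> int:
    """Cache reads needed before the write surcharge is recovered.

    A write costs (multiplier - 1) more than uncached input, and each
    read saves (1 - read multiplier) compared with uncached input.
    """
    surcharge = MULTIPLIER[write_kind] - 1.0
    saving_per_read = 1.0 - MULTIPLIER["read"]
    return math.ceil(surcharge / saving_per_read)


print(cached_cost(15_000, 15.0, "write_5m"))
print(reads_to_break_even("write_5m"), reads_to_break_even("write_1h"))
```

Under these multipliers, a 5-minute write pays for itself after one read and a 1-hour write after two.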

Parsing Code

// 10-turn Opus session savings (15K cached tokens per turn, $15/M input):
// Without cache: 10 × 15K tokens × $15/M = $2.25
// With cache: 1 write (2.0x, 1-hour tier) + 9 reads (0.1x) = $0.45 + $0.20 = $0.65
// Savings: 71% on input tokens
//
// 100-turn session: $50-100 without → $10-19 with caching
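The 10-turn arithmetic can be reproduced with a short calculation. This is a sketch under the same simplification as the figures above: every turn re-sends the same 15K-token prefix at $15 per million input tokens, the first turn pays one cache write, and the remaining turns pay cache reads. The $0.45 write figure corresponds to the 2.0x multiplier; the 1.25x multiplier is included for comparison. The function name `session_cost` is hypothetical.

```python
def session_cost(turns: int, tokens: int, price_per_mtok: float,
                 write_mult: float, read_mult: float = 0.1):
    """Return (uncached cost, cached cost, fractional saving) for a session."""
    per_turn = tokens / 1_000_000 * price_per_mtok
    uncached = turns * per_turn
    # One write on the first turn, then a cache read on each later turn.
    cached = per_turn * write_mult + (turns - 1) * per_turn * read_mult
    return uncached, cached, 1 - cached / uncached


for label, mult in (("2.0x write", 2.0), ("1.25x write", 1.25)):
    uncached, cached, saving = session_cost(10, 15_000, 15.0, mult)
    print(f"{label}: ${uncached:.2f} without, ${cached:.2f} with, "
          f"saving {saving:.1%}")
```

With the 1.25x write the cached total drops to roughly $0.48, so the write tier matters less than the number of reads that follow it.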

Gotchas

! The ephemeral cache expires after 5 minutes of inactivity; the next call pays the write cost again
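The expiry rule can be illustrated by classifying a sequence of call times. This sketch assumes a 300-second inactivity window that restarts on every call and treats a gap of exactly 300 seconds as expired; `classify_calls` is a hypothetical helper, and timestamps are plain seconds rather than real CLI data.

```python
def classify_calls(timestamps, ttl_seconds=300):
    """Label each call as a cache 'write' or 'read' by inactivity gap."""
    kinds = []
    last = None
    for t in timestamps:
        if last is not None and t - last < ttl_seconds:
            kinds.append("read")   # cache still warm
        else:
            kinds.append("write")  # first call, or cache expired
        last = t                   # every call restarts the window
    return kinds


# Calls at 0s, 60s, 200s, then a 400s pause, then one more call.
print(classify_calls([0, 60, 200, 600, 650]))
# → ['write', 'read', 'read', 'write', 'read']
```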

! Run prompts that share the same leading content back-to-back to maximize cache read hits (90% savings)
