Cost & Performance Intermediate
Cache Economics Explained
Understand prompt caching tiers and their 90% savings on repeated content
Command
"color:#9CA3AF;font-style:italic"># First call: pays cache write cost $ "color:#7C5CFC">claude -p "Hello" "color:#d97757">--output-format json "color:#9CA3AF;font-style:italic"># Subsequent calls within 5 min: 90% savings on cached content $ "color:#7C5CFC">claude -p "Hello again" "color:#d97757">--output-format json
Response
| Cache Tier | Duration | Cost | Savings | |----------------|----------|------------|--------| | Ephemeral write | 5 min | 1.25x input | — | | Long-lived write | 1 hour | 2.0x input | — | | Cache read | — | 0.1x input | 90% |
Parsing Code
059669">">// 10-turn Opus session savings: 059669">">// Without cache: 10 × 15K tokens × $15/M = $2.25 059669">">// With cache: 1 write + 9 reads = $0.45 + $0.20 = $0.65 059669">">// Savings: 71% on input tokens 059669">">// // 100-turn session: $50-100 without → $10-19 with caching
Gotchas
! Ephemeral cache expires after 5 minutes of inactivity — next call pays write cost again
! Run similar prompts back-to-back to maximize cache read hits (90% savings)