COMPUTE RACE

RENT OR OWN
The most important infrastructure decision in AI right now is whether to rent compute from AWS/Azure/GCP or build it yourself. The answer depends almost entirely on one number: your annual API spend.
BREAK-EVEN THRESHOLD: ~$500K/yr
Annual cloud API spend at which owned infrastructure starts paying off.

OWNED PAYBACK PERIOD: 12–18 mo
At $2–5M/yr scale, the investment pays for itself within 18 months.

COST REDUCTION: 60–80%
Per-token cost on self-hosted open models versus a frontier API at equivalent capability.

DEEPSEEK EFFICIENCY: $6M
DeepSeek's reported frontier-model training cost; the previous benchmark was $100M+. The bar keeps falling.
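The payback arithmetic behind these numbers can be sketched in a few lines. The cluster cost, spend level, and 60–80% cost-reduction range are the figures above; the amortization model (cluster paid up front, savings accrue monthly) and the function name are illustrative assumptions:

```python
# Sketch of the rent-vs-own payback arithmetic, using the figures above.
# Assumption: owned infra replaces cloud API spend at a 60-80% per-token
# cost reduction, and the cluster is paid for up front.

def payback_months(annual_api_spend: float,
                   cluster_cost: float,
                   cost_reduction: float = 0.7) -> float:
    """Months until cumulative savings cover the cluster cost."""
    monthly_savings = annual_api_spend * cost_reduction / 12
    return cluster_cost / monthly_savings

# A $2M cluster against $2M/yr of cloud API spend:
print(round(payback_months(2_000_000, 2_000_000)))  # ~17 months
```

At the midpoint 70% reduction this lands near the 12–18 month payback window the scale and enterprise tiers describe; at the 60% end it stretches toward 20 months, which is why the break-even threshold matters.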
STARTUP (<$50K/yr API spend): USE CLOUD
CLOUD: $3–12 per 1M tokens. No infrastructure overhead, instant start, pay as you go.
OWNED: $200K+ cluster cost. Break-even is years away; not viable.
GROWTH ($50K–$500K/yr): HYBRID
CLOUD: $50K–500K/yr. Selectively explore self-hosted open models to reduce cost.
OWNED: Partial (1–4 H100s). Self-host open models for dev/test; keep production frontier workloads on cloud APIs.
SCALE ($500K–$5M/yr): EVALUATE
CLOUD: Expensive. A total-cost-of-ownership (TCO) analysis is needed; cloud overhead starts to compound.
OWNED: $1–5M cluster. Roughly an 18-month payback on a 4×H100 node at $2M infrastructure spend.
ENTERPRISE ($5M+/yr): GO OWNED
CLOUD: Very expensive. Custom enterprise pricing helps, but it is still more expensive than owned at scale.
OWNED: $10M+ cluster. Direct payback within 12–18 months at this spend level.
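The four tiers reduce to a simple threshold lookup on annual API spend. The thresholds are the ones above; the function name and return strings are illustrative:

```python
# Map annual API spend to the tier recommendations above.
# Thresholds come from the tier table; the function itself is a sketch.

def recommend(annual_api_spend: float) -> str:
    if annual_api_spend < 50_000:
        return "Use Cloud"          # Startup: break-even is years away
    if annual_api_spend < 500_000:
        return "Hybrid"             # Growth: self-host selectively
    if annual_api_spend < 5_000_000:
        return "Evaluate"           # Scale: run a TCO analysis
    return "Go Owned"               # Enterprise: payback in 12-18 months

print(recommend(2_000_000))  # -> Evaluate
```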