COMPUTE RACE
NVIDIA VS THE WORLD
Google, AWS, Meta, and Tesla are all building custom AI chips, yet NVIDIA still owns an estimated 85% of inference workloads. Can anyone dethrone Jensen? The answer is more complicated than the headlines suggest.
INFERENCE MARKET SHARE (2026)
NVIDIA (H100/H200): 85%
Google (TPU): 8%
Custom (AWS/Meta/Tesla): 7%
Source: estimated from cloud provider reports and analyst data. Training workloads skew even higher toward NVIDIA.
CHIP SPECS — HEAD TO HEAD
TFLOPS = FP16/BF16 peak. Cost = estimated market rate or cloud $/hr; a back-of-envelope comparison follows the table.
CHIP         VENDOR   COST       TFLOPS   MEM BW      MEM      PRIMARY USE
H100 SXM     NVIDIA   $2.6/hr    1979     3.35 TB/s   80 GB    General-purpose AI training + inference
H200 SXM     NVIDIA   $3.2/hr    1979     4.8 TB/s    141 GB   Inference at scale: larger models, higher throughput
TPU v5e      Google   $2.2/hr    918      1.6 TB/s    16 GB    Google internal training + GCP inference
Trainium 3   AWS      $1.85/hr   2800     5.1 TB/s    96 GB    Training large models on AWS at 30% lower cost than H100
Dojo D1      Tesla    n/a        362      10 TB/s     32 GB    Video processing for FSD: purpose-built, not general
MTIA v2      Meta     n/a        800      1.2 TB/s    16 GB    Inference for Meta recommendation models + Llama serving
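A quick way to read the table is to normalize the raw specs: dollars per PFLOP-hour for compute value, and a bandwidth-bound decode ceiling for inference throughput. The sketch below does that arithmetic using only the estimates above; the 70 GB model size (a hypothetical 70B-parameter model at 8-bit weights) and the single-chip, batch-1 framing are illustrative assumptions, not benchmarks.

```python
# Back-of-envelope comparison from the spec table above. All hardware
# numbers are the article's estimates; the 70 GB model size is a
# hypothetical 70B-parameter model at 8-bit weights.

CHIPS = {
    # name: (vendor, cloud $/hr or None, TFLOPS, mem BW in TB/s, mem in GB)
    "H100 SXM":   ("NVIDIA", 2.60, 1979, 3.35,  80),
    "H200 SXM":   ("NVIDIA", 3.20, 1979, 4.80, 141),
    "TPU v5e":    ("Google", 2.20,  918, 1.60,  16),
    "Trainium 3": ("AWS",    1.85, 2800, 5.10,  96),
    "Dojo D1":    ("Tesla",  None,  362, 10.0,  32),
    "MTIA v2":    ("Meta",   None,  800, 1.20,  16),
}

MODEL_GB = 70  # hypothetical: 70B parameters at 8-bit weights

for name, (vendor, usd_hr, tflops, bw_tbs, mem_gb) in CHIPS.items():
    # Decoding one token must stream every weight from memory, so
    # memory bandwidth / model size is a hard ceiling on tokens/sec
    # for batch-1 inference (ignores KV cache and batching effects).
    tok_ceiling = bw_tbs * 1000 / MODEL_GB
    fits = "yes" if mem_gb >= MODEL_GB else "no (shard)"
    cost = f"${usd_hr / (tflops / 1000):.2f}/PFLOP-hr" if usd_hr else "n/a (internal)"
    print(f"{name:<11} {vendor:<7} {cost:<18} "
          f"decode ceiling {tok_ceiling:5.0f} tok/s   fits 70 GB: {fits}")
```

On these estimates Trainium 3 wins both normalized metrics among chips that hold the model on one device, while Dojo's huge bandwidth is moot for a model that does not fit in its 32 GB: a reminder that the raw TFLOPS column alone tells you little.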
WHY CUDA IS THE REAL MOAT
YEARS OF CUDA INVESTMENT: 19 years. Launched in 2007; nearly every ML researcher has built on it.
ML FRAMEWORKS ON CUDA: PyTorch, TensorFlow, JAX, Triton. The entire stack assumes NVIDIA hardware; the sketch after this list shows what that looks like in practice.
CUSTOM SILICON CHALLENGE: rewrite the stack. TPU and Trainium need their own compilers, operators, and kernels.
DEVELOPER LOCK-IN: millions of engineers. The switching cost is institutional, not just monetary.
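To make "the entire stack assumes NVIDIA hardware" concrete, here is a minimal sketch of a typical training step in PyTorch; the toy model and shapes are illustrative, not taken from any vendor's guide. Everything in it is standard PyTorch. The point is that the device selection and mixed-precision path default to CUDA, so moving it to a TPU or Trainium means swapping the runtime underneath, not just flipping a flag.

```python
# Minimal sketch: the CUDA assumptions baked into everyday training code.
import torch

# The canonical device line: use CUDA if present, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4096, 4096).to(device)  # illustrative toy model
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(8, 4096, device=device)

# Mixed precision via autocast; device_type here resolves to "cuda" or "cpu".
with torch.autocast(device_type=device.type, dtype=torch.bfloat16):
    loss = model(x).square().mean()
loss.backward()
opt.step()

# Porting this to a TPU swaps the device line for torch_xla's xla_device()
# and routes execution through the XLA compiler; Trainium goes through
# AWS Neuron instead. Hand-written CUDA or Triton kernels get no such
# translation and must be rewritten per vendor: that is the
# "rewrite the stack" cost named above.
```

The two-line port looks trivial; the institutional cost is everything beneath it: profilers, custom kernels, CI images, and years of CUDA-specific performance folklore.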