COMPUTE RACE

NVIDIA VS THE WORLD

Google, AWS, Meta, and Tesla are all building custom AI chips. NVIDIA still owns 85% of inference workloads. Can anyone dethrone Jensen? The answer is more complicated than the headlines suggest.

INFERENCE MARKET SHARE (2026)

NVIDIA (H100/H200):          85%
Google (TPU):                 8%
Custom (AWS/Meta/Tesla):      7%

Source: estimated from cloud provider reports and analyst data. Training workloads skew even higher toward NVIDIA.

CHIP SPECS — HEAD TO HEAD

TFLOPS = FP16 (BF16 where noted). Cost = estimated cloud rate in $/hr; chips that are not sold or rented publicly are marked n/a.

CHIP         VENDOR   CLOUD $/HR     TFLOPS   MEM BW      MEMORY
H100 SXM     NVIDIA   $2.60 (est.)    1,979   3.35 TB/s    80 GB
H200 SXM     NVIDIA   $3.20 (est.)    1,979   4.8 TB/s    141 GB
TPU v5e      Google   $2.20 (est.)      918   1.6 TB/s     16 GB
Trainium 3   AWS      $1.85 (est.)    2,800   5.1 TB/s     96 GB
Dojo D1      Tesla    n/a               362   10 TB/s      32 GB
MTIA v2      Meta     n/a               800   1.2 TB/s     16 GB

PRIMARY USE
H100 SXM:    general-purpose AI training + inference
H200 SXM:    inference at scale; larger models, higher throughput
TPU v5e:     Google internal training + GCP inference
Trainium 3:  training large models on AWS at roughly 30% lower cost than H100
Dojo D1:     video processing for FSD; purpose-built, not general
MTIA v2:     inference for Meta recommendation models + Llama serving
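
The TFLOPS-per-dollar math behind these comparisons is worth making explicit. Below is a minimal sketch using only the estimated figures from the table above, restricted to the cloud-rentable chips; the rates are estimates, so treat the output as directional, not as a benchmark.

```python
# Rough FP16-TFLOPS-per-dollar comparison, using the estimated
# cloud rates and spec-sheet TFLOPS from the table above.
chips = {
    "H100 SXM (NVIDIA)": (1979, 2.60),   # (TFLOPS, est. $/hr)
    "H200 SXM (NVIDIA)": (1979, 3.20),
    "TPU v5e (Google)":  (918,  2.20),
    "Trainium 3 (AWS)":  (2800, 1.85),
}

# Rank by compute delivered per dollar-hour, best first.
for name, (tflops, rate) in sorted(
    chips.items(), key=lambda kv: -kv[1][0] / kv[1][1]
):
    print(f"{name:20s} {tflops / rate:7.0f} TFLOPS per $/hr")

# The hourly rate alone reproduces the ~30% claim for Trainium 3:
print(f"Trainium 3 vs H100: {1 - 1.85 / 2.60:.0%} cheaper per hour")
```

TFLOPS per dollar flatters chips with weak software support, since it ignores memory bandwidth and real-world utilization. The CUDA section below is the counterweight to this arithmetic.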

WHY CUDA IS THE REAL MOAT
YEARS OF CUDA INVESTMENT: nearly two decades. Launched in 2007; every ML researcher since has built on it.
ML FRAMEWORKS ON CUDA: PyTorch, TensorFlow, JAX, Triton. The entire stack assumes NVIDIA hardware.
CUSTOM SILICON CHALLENGE: rewrite the stack. TPU and Trainium need their own compilers, operators, and kernels (see the sketch below).
DEVELOPER LOCK-IN: millions of engineers. The switching cost is institutional, not just monetary.
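
What "rewrite the stack" means in practice is easiest to see in code. The sketch below is illustrative, not a migration guide: the PyTorch calls are real, torch_xla is the actual PyTorch-on-TPU runtime package (shown only in comments), and the point is how deeply the "cuda" device string is baked into everyday training scripts.

```python
import torch

# The idiom found in millions of training scripts: the device
# string itself assumes NVIDIA hardware is the accelerator.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
y = model(x)  # dispatched to cuBLAS/cuDNN kernels on an NVIDIA GPU

# Moving this to a TPU is not a one-line swap. It means a different
# runtime, device handle, and execution model (lazy tensors compiled
# through XLA), e.g.:
#
#   import torch_xla.core.xla_model as xm
#   device = xm.xla_device()
#   ...
#   xm.mark_step()  # compile + run the pending graph
#
# And every custom CUDA or Triton kernel in a codebase has no
# TPU/Trainium equivalent until someone writes one. That rewrite,
# multiplied across the ecosystem, is the moat.
```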