Local AI Inference GPU Economics
Full Comparison Table · NVIDIA RTX · AMD ROCm · Intel ARC | Consumer & Data Center | LLM Inference Performance
Consumer: Marktplaats.nl • Cloud: Modal.com / Lambda Labs • DGX Systems • March 2026
| GPU Model | VRAM | Bandwidth | FP32 TFLOPS | FP8 TFLOPS | INT4 TFLOPS | FP4 TFLOPS | Purchase/Rental | Hours for €500 | Llama 8B tok/s | FP4 tok/s | INT8 | INT4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 🟢 NVIDIA RTX 30 Series (Ampere) — INT8 via Tensor Cores, no FP8/INT4 | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| RTX 3050 | 8GB | 224 GB/s | 9.1 | — | — | — | €200–270 | — | ~25–35 | 21 🔄 | ✅ | ❌ |
| RTX 3060 (Best Mid-Budget) | 12GB | 360 GB/s | 12.7 | — | — | — | €200–250 | — | ~35–45 | 34 🔄 | ✅ | ❌ |
| RTX 3060 Ti | 8GB | 448 GB/s | 16.2 | — | — | — | €300–380 | — | ~45–55 | 42 🔄 | ✅ | ❌ |
| RTX 3070 | 8GB | 448 GB/s | 20.3 | — | — | — | €300–350 | — | ~48–58 | 42 🔄 | ✅ | ❌ |
| RTX 3070 Ti | 8GB | 608 GB/s | 21.8 | — | — | — | €380–460 | — | ~50–62 | 57 🔄 | ✅ | ❌ |
| RTX 3080 | 10GB | 760 GB/s | 29.8 | — | — | — | €350–400 | — | ~60–75 | 71 🔄 | ✅ | ❌ |
| RTX 3080 12GB | 12GB | 912 GB/s | 30.6 | — | — | — | €550–650 | — | ~65–80 | 85 🔄 | ✅ | ❌ |
| RTX 3080 Ti | 12GB | 912 GB/s | 34.1 | — | — | — | €550–600 | — | ~68–83 | 85 🔄 | ✅ | ❌ |
| RTX 3090 (Best Value) | 24GB | 936 GB/s | 35.6 | — | — | — | €650–700 | — | ~70–85 | 87 🔄 | ✅ | ❌ |
| RTX 3090 Ti | 24GB | 1008 GB/s | 40.0 | — | — | — | €1000–1050 | — | ~75–90 | 94 🔄 | ✅ | ❌ |
| 🟢 NVIDIA RTX 40 Series (Ada Lovelace) — FP8 + INT4 via 4th-gen Tensor Cores | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| RTX 4060 8GB | 8GB | 272 GB/s | 15.1 | 121 | 242 | — | €300–380 | — | ~40–50 | 25 🔄 | ✅ | ✅ |
| RTX 4060 16GB (Best INT4 Budget) | 16GB | 272 GB/s | 15.1 | 121 | 242 | — | €400–480 | — | ~40–50 | 25 🔄 | ✅ | ✅ |
| RTX 4060 Ti 8GB | 8GB | 288 GB/s | 22.1 | 177 | 354 | — | €380–460 | — | ~45–55 | 27 🔄 | ✅ | ✅ |
| RTX 4060 Ti 16GB | 16GB | 288 GB/s | 22.1 | 177 | 354 | — | €480–560 | — | ~45–55 | 27 🔄 | ✅ | ✅ |
| RTX 4070 | 12GB | 504 GB/s | 29.1 | 233 | 466 | — | €470–520 | — | ~65–80 | 47 🔄 | ✅ | ✅ |
| RTX 4070 Super | 12GB | 504 GB/s | 35.5 | 284 | 568 | — | €600–650 | — | ~70–85 | 47 🔄 | ✅ | ✅ |
| RTX 4070 Ti | 12GB | 504 GB/s | 40.1 | 321 | 642 | — | €700–800 | — | ~75–90 | 47 🔄 | ✅ | ✅ |
| RTX 4070 Ti Super | 16GB | 672 GB/s | 44.1 | 353 | 706 | — | €800–900 | — | ~85–100 | 63 🔄 | ✅ | ✅ |
| RTX 4080 | 16GB | 716 GB/s | 48.7 | 390 | 780 | — | €850–900 | — | ~95–115 | 67 🔄 | ✅ | ✅ |
| RTX 4080 Super | 16GB | 736 GB/s | 52.2 | 418 | 836 | — | €830–880 | — | ~100–120 | 69 🔄 | ✅ | ✅ |
| RTX 4090 (Best Performance) | 24GB | 1008 GB/s | 82.6 | 661 | 1322 | — | €2200–2250 | — | ~120–145 | 94 🔄 | ✅ | ✅ |
| 🟢 NVIDIA RTX 50 Series (Blackwell, Jan 2025) — FP8 + INT4 + FP4 via 5th-gen Tensor Cores (new) | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| RTX 5060 | 8GB | 448 GB/s | 19.2 | 154 | 307 | 307 | €320–400* | — | ~55–70 | 67 ⚡ | ✅ | ✅ |
| RTX 5060 Ti 8GB | 8GB | 448 GB/s | 28.0 | 224 | 448 | 448 | €400–480* | — | ~60–75 | 67 ⚡ | ✅ | ✅ |
| RTX 5060 Ti 16GB | 16GB | 448 GB/s | 28.0 | 224 | 448 | 448 | €450–550* | — | ~60–75 | 67 ⚡ | ✅ | ✅ |
| RTX 5070 | 12GB | 672 GB/s | 45.0 | 360 | 720 | 720 | €600–750* | — | ~85–105 | 100 ⚡ | ✅ | ✅ |
| RTX 5070 Ti | 16GB | 896 GB/s | 58.5 | 468 | 936 | 936 | €850–1000* | — | ~100–125 | 134 ⚡ | ✅ | ✅ |
| RTX 5080 | 16GB | 960 GB/s | 65.0 | 520 | 1040 | 1040 | €1250–1300* | — | ~115–140 | 144 ⚡ | ✅ | ✅ |
| RTX 5090 (Cutting Edge) | 32GB | 1792 GB/s | 125.0 | 1000 | 2000 | 2000 | €2500–2600* | — | ~160–200 | 268 ⚡ | ✅ | ✅ |
| ☁️ NVIDIA Data Center GPUs (Modal cloud rental pricing) | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF | FP4 TF | Rental / Purchase | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| Nvidia T4 | 16GB | 320 GB/s | 8.1 | — | — | — | $0.59/hr • €450–550 used | 898 hrs | ~40–55 | — | ✅ | ❌ |
| Nvidia L4 | 24GB | 300 GB/s | 30.3 | — | — | — | $0.80/hr • €1,500–2,000 used | 663 hrs | ~60–80 | — | ✅ | ✅ |
| Nvidia A10 | 24GB | 600 GB/s | 31.2 | — | — | — | $1.10/hr • €1,700–2,600 used | 481 hrs | ~70–90 | — | ✅ | ✅ |
| Nvidia L40S | 48GB | 864 GB/s | 91.6 | 733 | 1466 | — | $1.95/hr • €7,500–10,200 used | 272 hrs | ~140–180 | — | ✅ | ✅ |
| Nvidia A100 40GB | 40GB | 1555 GB/s | 19.5 | 312 | 624 | — | $2.10/hr • €7,000–9,000 used | 253 hrs | ~180–220 | — | ✅ | ✅ |
| Nvidia A100 80GB | 80GB | 2039 GB/s | 19.5 | 312 | 624 | — | $2.50/hr • €14,000–20,500 used | 212 hrs | ~200–250 | — | ✅ | ✅ |
| Nvidia H100 | 80GB | 3350 GB/s | 67 | 1979 | 3958 | — | $3.95/hr • €27,500+ used | 134 hrs | ~350–450 | — | ✅ | ✅ |
| Nvidia H200 | 141GB | 4800 GB/s | 67 | 1979 | 3958 | — | $4.54/hr • €37,500+ used | 117 hrs | ~400–500 | — | ✅ | ✅ |
| Nvidia B200 (Latest) | 192GB | 8000 GB/s | 4500 | 9000 | 18000 | 18000 | $6.25/hr • rare used | 85 hrs | ~600–800 | — | ✅ | ✅ |
| 🖥️ NVIDIA DGX Spark — Compact AI system (GB10 Grace Blackwell, 128GB unified), ~$4.7k / €2.8k–4.2k | ||||||||||||
| System | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| DGX Spark 128GB (FP4) | 128GB | 273 GB/s | ~50 | ~500 | ~1000 | ~1000 | ~$4,700 / €2,800–4,200 | — | ~200–300 | 41 ⚡ | ✅ | ✅ |
| 🔴 AMD ROCm — RDNA 2 (RX 6000 Series) — Community ROCm support (ROCm 5.x), shader INT8 only | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF† | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| RX 6800 XT | 16GB | 512 GB/s | 20.7 | — | — | — | €280–350 | — | ~55–70 | 48 🔄 | ✅ | ❌ |
| RX 6900 XT | 16GB | 512 GB/s | 23.1 | — | — | — | €350–430 | — | ~60–75 | 48 🔄 | ✅ | ❌ |
| 🔴 AMD ROCm — RDNA 3 (RX 7000 Series) — Official ROCm support (ROCm 5.7+), WMMA INT8+INT4 via AI Accelerators | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF (WMMA) | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| RX 7700 XT | 12GB | 432 GB/s | 35.7 | — | ~143 | — | €280–340 | — | ~65–80 | 40 🔄 | ✅ | ✅ |
| RX 7800 XT | 16GB | 576 GB/s | 37.0 | — | ~148 | — | €350–430 | — | ~75–95 | 54 🔄 | ✅ | ✅ |
| RX 7900 GRE | 16GB | 576 GB/s | 45.9 | — | ~184 | — | €400–500 | — | ~75–95 | 54 🔄 | ✅ | ✅ |
| RX 7900 XT | 20GB | 800 GB/s | 53.4 | — | ~214 | — | €550–650 | — | ~90–110 | 75 🔄 | ✅ | ✅ |
| RX 7900 XTX (Best AMD Value) | 24GB | 960 GB/s | 61.4 | — | ~246 | — | €650–800 | — | ~100–130 | 90 🔄 | ✅ | ✅ |
| 🔴 AMD ROCm — RDNA 4 (RX 9000 Series, Mar 2025) — Full FP8 + INT4 AI Accelerators (new) | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF (AI Acc) | INT4 TF (AI Acc) | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| RX 9070 | 16GB | 512 GB/s | ~40.0 | ~320 | ~320 | — | €450–550 | — | ~80–100 | 48 🔄 | ✅ | ✅ |
| RX 9070 XT (Best AMD New) | 16GB | 640 GB/s | ~54.0 | ~432 | ~432 | — | €550–650 | — | ~100–130 | 60 🔄 | ✅ | ✅ |
| 🔴 AMD Instinct — Data Center GPUs (ROCm cloud, Lambda Labs pricing) | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF (CU) | FP8 TF (Matrix) | INT4 TF (Matrix) | FP4 TF | Rental Price/hr | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| AMD MI300X (Flagship) | 192GB | 5300 GB/s | 163 | 2614 | 5228 | — | ~$4.00/hr | ~133 hrs | ~600–800 | — | ✅ | ✅ |
| 🔵 Intel ARC Alchemist (A-Series) — XMX INT8 + INT4, SYCL/llama.cpp backend (oneAPI) | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF (XMX) | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| Arc A380 | 6GB | 188 GB/s | 7.0 | — | ~112 | — | €80–130 | — | ~15–25 | 18 🔄 | ✅ | ✅ |
| Arc A580 | 8GB | 512 GB/s | 12.4 | — | ~198 | — | €130–170 | — | ~25–35 | 48 🔄 | ✅ | ✅ |
| Arc A750 | 8GB | 512 GB/s | 17.6 | — | ~282 | — | €180–240 | — | ~30–40 | 48 🔄 | ✅ | ✅ |
| Arc A770 8GB | 8GB | 560 GB/s | 19.7 | — | ~315 | — | €230–290 | — | ~30–45 | 52 🔄 | ✅ | ✅ |
| Arc A770 16GB (Best Intel Value) | 16GB | 560 GB/s | 19.7 | — | ~315 | — | €290–360 | — | ~30–45 | 52 🔄 | ✅ | ✅ |
| 🔵 Intel ARC Battlemage (B-Series, 2024–2025) — Xe2 XMX with improved matrix throughput | ||||||||||||
| GPU Model | VRAM | Mem BW | FP32 TF | FP8 TF | INT4 TF (XMX Xe2) | FP4 TF | Purchase Price | Hrs/€500 | Llama 8B tok/s | FP4 tok/s | INT8 HW | INT4 HW |
| Arc B50 (Entry) | 8GB | 224 GB/s | ~7.0 | — | ~224 | — | €100–140* | — | ~18–28 | 21 🔄 | ✅ | ✅ |
| Arc B60 (Mid) | 8GB | 320 GB/s | ~11.2 | — | ~358 | — | €150–190* | — | ~25–38 | 30 🔄 | ✅ | ✅ |
| Arc B580 (Best Intel Budget) | 12GB | 456 GB/s | 14.0 | — | ~448 | — | €230–280 | — | ~35–50 | 43 🔄 | ✅ | ✅ |
| Arc B770 (New 2025) | 16GB | 616 GB/s | ~24.0 | — | ~768 | — | €350–450* | — | ~50–70 | 58 🔄 | ✅ | ✅ |
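
The Hrs/€500 values in the rental rows appear to follow from the hourly dollar price and a fixed exchange rate. The sketch below reproduces that arithmetic and adds a simple rent-versus-buy break-even. It assumes a rate of about 1.06 USD per EUR, which is inferred from the rows and not stated in the source. The function names `hours_for_budget` and `break_even_hours` are illustrative, and the break-even figure ignores electricity, resale value and depreciation.

```python
# Assumption: exchange rate inferred from the table (e.g. 898 hrs at $0.59/hr),
# not a figure given by the source.
USD_PER_EUR = 1.06


def hours_for_budget(usd_per_hour: float, budget_eur: float = 500.0,
                     usd_per_eur: float = USD_PER_EUR) -> float:
    """Rental hours that a euro budget buys at a dollar hourly rate."""
    return budget_eur * usd_per_eur / usd_per_hour


def break_even_hours(purchase_eur: float, usd_per_hour: float,
                     usd_per_eur: float = USD_PER_EUR) -> float:
    """Rental hours after which the rental bill exceeds the purchase price.

    Hypothetical helper: ignores electricity, resale value and depreciation.
    """
    eur_per_hour = usd_per_hour / usd_per_eur
    return purchase_eur / eur_per_hour


if __name__ == "__main__":
    # Hourly prices taken from the data center rows of the table.
    print(round(hours_for_budget(0.59)))  # T4   -> 898
    print(round(hours_for_budget(3.95)))  # H100 -> 134

    # Low end of the used T4 price range (€450) against its $0.59/hr rental.
    print(round(break_even_hours(450, 0.59)))  # -> 808
```

Read this way, a card is worth buying only if the expected usage exceeds the break-even hours. Below that, the rental rows give more compute per euro.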