GPU A (Faster)
Apple M2 Ultra (192GB)
- VRAM: 192 GB
- Bandwidth: 800 GB/s
- Street price: $5,499
- Vendor: Apple
GPU B
Apple M4 Max (128GB)
- VRAM: 128 GB
- Bandwidth: 546 GB/s
- Street price: $3,999
- Vendor: Apple
The short answer
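Apple M2 Ultra (192GB) runs every model that Apple M4 Max (128GB) can, plus one that is too large for the M4 Max. It is more than 20% faster on 37 models, and speeds are within 20% on 3 more. The M4 Max comes out 20%+ faster on 5 models, but in each of those rows it is listed at Q8_0 while the M2 Ultra is listed at FP16, so the two are not compared at the same precision. In every row where both GPUs run the same quantization, the M2 Ultra is faster.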
- 1 model: A runs, B can't
- 37 models: A is 20%+ faster
- 5 models: B is 20%+ faster
- 3 models: equal (both run, <20% diff)
- 8 models: too large for either
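The five buckets above can be reproduced from a pair of throughput figures. The following is a minimal sketch, not the page's actual logic: it assumes "20%+ faster" means the faster figure is at least 1.2 times the slower one (the page does not state the exact formula), and the function name `classify` and the `None`-means-too-large convention are hypothetical.

```python
def classify(a_tps, b_tps, threshold_pct=20):
    """Bucket one model by its throughput on GPU A and GPU B.

    a_tps / b_tps are tokens per second, or None when the model is too
    large for that GPU. Assumption: "20%+ faster" means
    faster >= slower * (100 + threshold_pct) / 100.
    Integer arithmetic is used so that borderline cases are exact.
    """
    if a_tps is None and b_tps is None:
        return "too large"
    if b_tps is None:
        return "A (only)"
    if a_tps is None:
        return "B (only)"
    if a_tps * 100 >= b_tps * (100 + threshold_pct):
        return "A (faster)"
    if b_tps * 100 >= a_tps * (100 + threshold_pct):
        return "B (faster)"
    return "equal"


# A few rows taken from the table on this page
print(classify(41, 28))      # DeepSeek R1 Distill Llama 8B
print(classify(5, 7))        # Llama 3.3 70B (FP16 on A, Q8_0 on B)
print(classify(5, 5))        # Mistral Large 2 123B
print(classify(5, None))     # Qwen3-235B-A22B-Instruct-2507
print(classify(None, None))  # Llama 3.1 405B
```

The table shows rounded tok/s values, so a row that sits right on the 20% boundary may land in a different bucket here than on the page.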
Model-by-model fit
The Winner column reads "equal" when both GPUs run the model within 20% of each other.
| Model | Apple M2 Ultra (192GB) | Apple M4 Max (128GB) | Winner |
|---|---|---|---|
| c4ai-command-r-v01 35B (35B · command) | 10 tok/s · FP16 | 7 tok/s · FP16 | A (faster) |
| Command-R+ 104B (104B · command) | 6 tok/s · Q8_0 | 4 tok/s · Q8_0 | A (faster) |
| DeepSeek R1 Distill Llama 8B (8B · deepseek) | 41 tok/s · FP16 | 28 tok/s · FP16 | A (faster) |
| DeepSeek R1 Distill Qwen 14B (14.8B · deepseek) | 23 tok/s · FP16 | 15 tok/s · FP16 | A (faster) |
| DeepSeek R1 Distill Llama 70B (70.6B · deepseek) | 5 tok/s · FP16 | 7 tok/s · Q8_0 | B (faster) |
| DeepSeek R1 671B (671B · deepseek) | Too large | Too large | — |
| DeepSeek-V3 685B (685B · deepseek) | Too large | Too large | — |
| DeepSeek-V3.2 685.4B (685.4B · deepseek) | Too large | Too large | — |
| gemma-2-9b (9.2B · gemma) | 36 tok/s · FP16 | 25 tok/s · FP16 | A (faster) |
| gemma-2-27b (27.2B · gemma) | 12 tok/s · FP16 | 8 tok/s · FP16 | A (faster) |
| Llama 3.1 8B Compact (8B · llama) | 37 tok/s · FP16 | 25 tok/s · FP16 | A (faster) |
| CodeLlama 34B (34B · llama) | 10 tok/s · FP16 | 7 tok/s · FP16 | A (faster) |
| CodeLlama 34B (34B · llama) | 10 tok/s · FP16 | 7 tok/s · FP16 | A (faster) |
| Llama 3.3 70B (70.6B · llama) | 5 tok/s · FP16 | 7 tok/s · Q8_0 | B (faster) |
| Llama 3.1 70B (70.6B · llama) | 5 tok/s · FP16 | 7 tok/s · Q8_0 | B (faster) |
| Llama 4 Scout 17B (109B · llama) | 5 tok/s · Q8_0 | 5 tok/s · Q6_K | equal |
| Llama-4-Maverick-17B-128E (400B · llama) | Too large | Too large | — |
| Llama 3.1 405B (405B · llama) | Too large | Too large | — |
| Mistral 7B v0.1 (7.25B · mistral) | 45 tok/s · FP16 | 31 tok/s · FP16 | A (faster) |
| Codestral 22B (22.2B · mistral) | 15 tok/s · FP16 | 10 tok/s · FP16 | A (faster) |
| Mixtral 8x7B Instruct v0.1 (47B · mixtral) | 6 tok/s · FP16 | 4 tok/s · FP16 | A (faster) |
| Mistral Large 2 123B (123B · mistral) | 5 tok/s · Q8_0 | 5 tok/s · Q6_K | equal |
| Phi-4-mini 3.8B (3.8B · phi) | 84 tok/s · FP16 | 57 tok/s · FP16 | A (faster) |
| Phi-4 14B (14B · phi) | 21 tok/s · FP16 | 14 tok/s · FP16 | A (faster) |
| Qwen 2.5 1.5B (1.5B · qwen) | 195 tok/s · FP16 | 133 tok/s · FP16 | A (faster) |
| Qwen 2.5 3B (3.1B · qwen) | 102 tok/s · FP16 | 69 tok/s · FP16 | A (faster) |
| Qwen3.5-4B (4.7B · qwen) | 69 tok/s · FP16 | 47 tok/s · FP16 | A (faster) |
| Qwen 2.5 7B (7.6B · qwen) | 43 tok/s · FP16 | 30 tok/s · FP16 | A (faster) |
| Qwen 2.5 7B (7.6B · qwen) | 43 tok/s · FP16 | 30 tok/s · FP16 | A (faster) |
| Qwen 3 8B (8B · qwen) | 37 tok/s · FP16 | 25 tok/s · FP16 | A (faster) |
| Qwen3.5-9B (9.7B · qwen) | 34 tok/s · FP16 | 23 tok/s · FP16 | A (faster) |
| Qwen 3 32B (32B · qwen) | 9 tok/s · FP16 | 6 tok/s · FP16 | A (faster) |
| Qwen3.5-35B-A3B (36B · qwen) | 9 tok/s · FP16 | 6 tok/s · FP16 | A (faster) |
| Qwen 2.5 72B (72.7B · qwen) | 5 tok/s · FP16 | 6 tok/s · Q8_0 | B (faster) |
| Qwen 2.5 72B (72.7B · qwen) | 5 tok/s · FP16 | 6 tok/s · Q8_0 | B (faster) |
| Llama 3.2 1B (1.24B · llama) | 238 tok/s · FP16 | 163 tok/s · FP16 | A (faster) |
| Llama 4 Scout 17B (109B · llama) | 5 tok/s · Q8_0 | 5 tok/s · Q6_K | equal |
| DeepSeek R1 671B (671B · deepseek) | Too large | Too large | — |
| Gemma 3 27B (27B · gemma) | 11 tok/s · FP16 | 7 tok/s · FP16 | A (faster) |
| Qwen 3 8B (8B · qwen) | 37 tok/s · FP16 | 25 tok/s · FP16 | A (faster) |
| Qwen 3 32B (32B · qwen) | 9 tok/s · FP16 | 6 tok/s · FP16 | A (faster) |
| Llama 3.1 8B Compact (8B · llama) | 37 tok/s · FP16 | 25 tok/s · FP16 | A (faster) |
| Mixtral 8x7B Instruct v0.1 (47B · mixtral) | 6 tok/s · FP16 | 4 tok/s · FP16 | A (faster) |
| Mistral Small 3.2 24B (24B · mistral) | 15 tok/s · FP16 | 10 tok/s · FP16 | A (faster) |
| Command A 111B (111B · command) | 6 tok/s · Q8_0 | 4 tok/s · Q8_0 | A (faster) |
| DeepSeek R1 0528 (685B · deepseek) | Too large | Too large | — |
| DeepSeek-V3-0324 (684.5B · deepseek) | Too large | Too large | — |
| DeepSeek-R1-0528-Qwen3-8B (8.2B · qwen) | 42 tok/s · FP16 | 29 tok/s · FP16 | A (faster) |
| Qwen3-235B-A22B-Instruct-2507 (235B · qwen) | 5 tok/s · Q4_K_M | Too large | A (only) |
| Qwen3-30B-A3B-Instruct-2507 (30B · qwen) | 23 tok/s · Q8_0 | 16 tok/s · Q8_0 | A (faster) |
| Qwen3-4B-Instruct-2507 (4B · qwen) | 87 tok/s · FP16 | 59 tok/s · FP16 | A (faster) |
| gemma-4-E4B-it (8B · gemma) | 44 tok/s · FP16 | 30 tok/s · FP16 | A (faster) |
| gemma-4-26B-A4B-it (26.5B · gemma) | 26 tok/s · Q8_0 | 18 tok/s · Q8_0 | A (faster) |
| gemma-4-31B-it (32.7B · gemma) | 21 tok/s · Q8_0 | 15 tok/s · Q8_0 | A (faster) |
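One pattern in the table is easy to check with a few lines of arithmetic: in rows where both GPUs run the same quantization, the throughput ratio sits close to the ratio of the two listed memory bandwidths (800 GB/s vs 546 GB/s). The sketch below compares the two for a handful of rows. It is an observation about the published figures only; the page does not say how its tok/s numbers are derived, and the helper name `throughput_ratios` is hypothetical.

```python
BANDWIDTH_A = 800  # GB/s, Apple M2 Ultra (192GB), from the spec list
BANDWIDTH_B = 546  # GB/s, Apple M4 Max (128GB), from the spec list

# (model, tok/s on A, tok/s on B) for a few same-quantization rows in the table
ROWS = [
    ("DeepSeek R1 Distill Llama 8B", 41, 28),
    ("Phi-4-mini 3.8B", 84, 57),
    ("Qwen 2.5 1.5B", 195, 133),
    ("Llama 3.2 1B", 238, 163),
]


def throughput_ratios(rows):
    """Return (model, A/B throughput ratio) for each row."""
    return [(name, a / b) for name, a, b in rows]


bandwidth_ratio = BANDWIDTH_A / BANDWIDTH_B
for name, ratio in throughput_ratios(ROWS):
    print(f"{name}: throughput ratio {ratio:.2f}, bandwidth ratio {bandwidth_ratio:.2f}")
```

Rows where the two GPUs run different quantizations (for example FP16 on A and Q8_0 on B) do not follow this pattern, which is where the "B (faster)" and "equal" results appear.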