Value Density Score
Benchmark performance per dollar of input cost, computed as `benchmark_mmlu / input_price_per_1m`, i.e. the MMLU score (0–100) divided by the input price in USD per million tokens. The score is only computed when both inputs are verified. See /methodology#vds for details.
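A minimal sketch of the computation, using the field names from the formula above (this helper is illustrative, not the site's actual implementation):

```python
from typing import Optional

def value_density_score(
    benchmark_mmlu: Optional[float],
    input_price_per_1m: Optional[float],
) -> Optional[float]:
    """Benchmark performance per dollar of input cost.

    Returns None unless both inputs are available, mirroring the rule
    that the score is only computed when both inputs are verified.
    """
    if benchmark_mmlu is None or input_price_per_1m is None:
        return None
    if input_price_per_1m <= 0:
        return None  # undefined for free or unpriced models
    return round(benchmark_mmlu / input_price_per_1m, 2)

# Example: an MMLU score of 88.7 at $2.50 per 1M input tokens gives
# 88.7 / 2.50 = 35.48, reproducing the GPT-4o row in the table below.
print(value_density_score(88.7, 2.50))  # 35.48
```

The `None` guard reflects the verification rule: a missing benchmark or price yields no score rather than a misleading zero.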
24 entities, sorted by integrations (descending).
| Entity | Value Density Score | Type | Integrations |
|---|---|---|---|
| Claude 3 Sonnet | 26.33 | — | 48 |
| o1 | 6.15 | — | 48 |
| GPT-4 Turbo | 8.64 | — | 47 |
| Mixtral 8x7B | 130.74 | — | 43 |
| GPT-4o | 35.48 | — | 37 |
| Gemini 1.0 Pro | 158.2 | — | 34 |
| Mistral 7B | 250 | — | 33 |
| Llama 2 70B | 76.56 | — | 32 |
| DeepSeek R1 | 129.71 | — | 32 |
| Mistral Large 2 | 42 | — | 30 |
| Qwen 2.5 72B | 717.5 | — | 30 |
| Llama 3 70B | 160.78 | — | 29 |
| Claude 3.5 Sonnet | 29.57 | — | 29 |
| Llama 3.1 405B | 17.72 | — | 28 |
| GPT-4 | 2.88 | — | 27 |
| GPT-3.5 Turbo | 46.67 | — | 26 |
| Command R+ | 30.28 | — | 21 |
| DeepSeek V3 | 327.78 | — | 18 |
| Llama 3.3 70B | 860 | — | 17 |
| GPT-4o mini | 546.67 | — | 17 |
| Claude 3 Haiku | 300.8 | — | 16 |
| GPT-3 (davinci-002) | 0.73 | — | 16 |
| Claude 3 Opus | 5.79 | — | 15 |
| Gemini 1.5 Pro | 68.72 | — | 14 |