
MMLU

Score on the Massive Multitask Language Understanding (MMLU) benchmark, which tests knowledge across 57 academic subjects. Scores above 90 indicate frontier-level capability. Scores are not directly comparable across model architectures.

33 entities tracked · number · performance
Score   Entities
82      2
88.7    2
75.2    2
86.4    2
88.5    2
86      1
88.6    1
68.9    1
88.3    1
79      1
86.8    1
87.5    1
70.6    1
84      1
62.5    1
79.1    1
87.9    1
85.9    1
78.9    1
88.9    1
70      1
92.3    1
81.5    1
75.7    1
85.1    1
86.1    1
90.8    1
43.9    1
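
The listing above pairs each score with a second number. The sketch below assumes that number is how many entities report the score, which fits the stated total of 33, and applies the "above 90" threshold from the description. The names SCORE_COUNTS, FRONTIER_THRESHOLD, total_entities, and entities_above are illustrative and not part of the source.

```python
# (score, entity_count) pairs transcribed from the listing above.
# Assumption: the second number in each pair is an entity count.
SCORE_COUNTS = [
    (82, 2), (88.7, 2), (75.2, 2), (86.4, 2), (88.5, 2),
    (86, 1), (88.6, 1), (68.9, 1), (88.3, 1), (79, 1),
    (86.8, 1), (87.5, 1), (70.6, 1), (84, 1), (62.5, 1),
    (79.1, 1), (87.9, 1), (85.9, 1), (78.9, 1), (88.9, 1),
    (70, 1), (92.3, 1), (81.5, 1), (75.7, 1), (85.1, 1),
    (86.1, 1), (90.8, 1), (43.9, 1),
]

# "Scores above 90 indicate frontier-level capability" per the description.
FRONTIER_THRESHOLD = 90


def total_entities(pairs):
    """Sum the entity counts across all distinct scores."""
    return sum(count for _, count in pairs)


def entities_above(pairs, threshold):
    """Count entities whose score is strictly above the threshold."""
    return sum(count for score, count in pairs if score > threshold)


total = total_entities(SCORE_COUNTS)
frontier = entities_above(SCORE_COUNTS, FRONTIER_THRESHOLD)
print(f"{frontier} of {total} entities score above {FRONTIER_THRESHOLD}")
```

Under that reading, the counts add up to the 33 tracked entities, and two of them (92.3 and 90.8) clear the threshold.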