LightEval Benchmarks

Overview of supported benchmarks, grouped by category.

Coming Soon

Detailed benchmark documentation is in progress.

Benchmark Categories

Commonsense Reasoning: HellaSwag, WinoGrande, OpenBookQA
Scientific Reasoning: ARC Easy, ARC Challenge
Physical Commonsense: PIQA
Truthfulness: TruthfulQA
Mathematics: GSM8K, MATH
Knowledge: MMLU, TriviaQA
Language Understanding: GLUE benchmarks
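
The categories above could be represented as a plain Python mapping, for instance to pick benchmarks by category in a script. This is only a sketch: the names `BENCHMARK_CATEGORIES` and `benchmarks_in` are hypothetical and not part of LightEval, and the strings are the display names from the list above, not LightEval task identifiers.

```python
# Hypothetical mapping built from the category list above.
# These are display names, not LightEval task identifiers.
BENCHMARK_CATEGORIES = {
    "Commonsense Reasoning": ["HellaSwag", "WinoGrande", "OpenBookQA"],
    "Scientific Reasoning": ["ARC Easy", "ARC Challenge"],
    "Physical Commonsense": ["PIQA"],
    "Truthfulness": ["TruthfulQA"],
    "Mathematics": ["GSM8K", "MATH"],
    "Knowledge": ["MMLU", "TriviaQA"],
    "Language Understanding": ["GLUE benchmarks"],
}


def benchmarks_in(category):
    """Return the benchmark names for a category (case-insensitive match)."""
    wanted = category.strip().lower()
    for name, benchmarks in BENCHMARK_CATEGORIES.items():
        if name.lower() == wanted:
            return list(benchmarks)  # copy so callers cannot mutate the mapping
    raise KeyError(f"unknown category: {category}")
```

Calling `benchmarks_in("Mathematics")` returns the two mathematics benchmarks listed above; an unrecognized category raises `KeyError`. Mapping these names to actual task identifiers is left to the LightEval README.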

For complete documentation, see the LightEval README.