LightEval Benchmarks

The complete list of supported benchmarks, grouped by category.

Detailed benchmark documentation is in progress.

  • Commonsense Reasoning: HellaSwag, WinoGrande, OpenBookQA
  • Scientific Reasoning: ARC Easy, ARC Challenge
  • Physical Commonsense: PIQA
  • Truthfulness: TruthfulQA
  • Mathematics: GSM8K, MATH
  • Knowledge: MMLU, TriviaQA
  • Language Understanding: GLUE benchmarks
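
If you want to pick benchmarks by category in your own scripts, the list above can be held in a plain dictionary. This is a minimal sketch and not part of LightEval: `BENCHMARKS_BY_CATEGORY`, `benchmarks_for`, and `category_of` are hypothetical names, and the strings are the display names from this page, not LightEval task identifiers.

```python
# Hypothetical grouping that mirrors the category list on this page.
# The strings are display names, NOT LightEval task identifiers.
BENCHMARKS_BY_CATEGORY = {
    "Commonsense Reasoning": ["HellaSwag", "WinoGrande", "OpenBookQA"],
    "Scientific Reasoning": ["ARC Easy", "ARC Challenge"],
    "Physical Commonsense": ["PIQA"],
    "Truthfulness": ["TruthfulQA"],
    "Mathematics": ["GSM8K", "MATH"],
    "Knowledge": ["MMLU", "TriviaQA"],
    "Language Understanding": ["GLUE benchmarks"],
}


def benchmarks_for(*categories):
    """Return the benchmarks listed under the given categories, in order."""
    selected = []
    for category in categories:
        # A KeyError here means the category is not on this page.
        selected.extend(BENCHMARKS_BY_CATEGORY[category])
    return selected


def category_of(benchmark):
    """Return the category a benchmark is listed under, or None if absent."""
    for category, names in BENCHMARKS_BY_CATEGORY.items():
        if benchmark in names:
            return category
    return None


if __name__ == "__main__":
    print(benchmarks_for("Mathematics", "Knowledge"))
    print(category_of("PIQA"))
```

Mapping these display names to the task names that LightEval actually accepts is left to the README.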

For complete documentation, see the LightEval README.