Add GPQA evaluation result

#35
by burtenshaw HF Staff - opened

Evaluation Results

This PR adds structured evaluation results using the new .eval_results/ format.

What This Enables

  • Model Page: Results appear on the model page with benchmark links
  • Leaderboards: Scores are aggregated into benchmark dataset leaderboards
  • Verification: Support for cryptographic verification of evaluation runs

Model Evaluation Results

Format Details

Results are stored as YAML files in the .eval_results/ folder of the model repository. See the Eval Results Documentation for the full specification.
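
As a rough illustration of the layout described above, the sketch below lists the YAML files found under a local checkout's .eval_results/ folder. The function name `find_eval_result_files`, the accepted `.yaml`/`.yml` extensions, and the idea of scanning subfolders are assumptions made for this example; the file schema itself is defined in the Eval Results Documentation and is not reproduced here.

```python
from pathlib import Path


def find_eval_result_files(repo_root):
    """Return YAML files under <repo_root>/.eval_results/, sorted by path.

    Hypothetical helper: the folder name comes from the PR description,
    while the extensions and recursive scan are assumptions.
    """
    results_dir = Path(repo_root) / ".eval_results"
    if not results_dir.is_dir():
        # No structured results have been added to this checkout.
        return []
    return sorted(
        p
        for p in results_dir.rglob("*")
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    # Print result files relative to the current directory, if any exist.
    for path in find_eval_result_files("."):
        print(path)
```

Run from the root of a cloned model repository, this prints one path per result file, or nothing if the folder is absent.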


Generated by community-evals

burtenshaw changed pull request status to closed
