alphaXiv's Collections

spurious-rewards

updated Jan 6

  • alphaXiv/spurious-rewards-rlvr-training-qwen-2.5-1.5b-math-ckpt-400

    2B • Updated Jan 1 • 1

  • alphaXiv/spurious-rewards-rlvr-training-qwen-2.5-1.5b-math-ckpt-1000

    2B • Updated Jan 1 • 3

  • alphaXiv/spurious-rewards-rlvr-training-qwen-2.5-1.5b-math-ckpt-200

    2B • Updated Jan 1 • 1

  • alphaXiv/spurious-rewards-rlvr-training-qwen-2.5-1.5b-math-ckpt-50

    2B • Updated Jan 1 • 1

  • alphaXiv/spurious-rewards-reasoning-traces

    Updated Jan 6

  • alphaXiv/spurious-rewards-data

    Preview • Updated Jan 6 • 6