GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning Paper • 2511.11653 • Published Nov 10, 2025 • 55
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • 6B • Updated 23 days ago • 248k • 1.55k
SimpleRL-Zoo Collection The collection for the Paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild" • 13 items • Updated May 5, 2025 • 8
nishadsinghi/math7500_train_solutions_DeepSeek-R1-Distill-Qwen-7B_32K_tokens Updated Feb 13, 2025 • 7.45k • 6 • 2
Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots • Running on CPU Upgrade • 13.8k