AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons Paper • 2503.05731 • Published Feb 19, 2025 • 3
AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons Paper • 2503.05731 • Published Feb 19, 2025 • 3
Introducing v0.5 of the AI Safety Benchmark from MLCommons Paper • 2404.12241 • Published Apr 18, 2024 • 13
view article Article Introducing AI Sheets: a tool to work with datasets using open AI models! +4 Aug 8, 2025 • 107
view article Article 🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It? Mar 17, 2025 • 352
Running Featured 125 Open-LLM performances are plateauing, let’s make the leaderboard steep again 🏔 125 Explore and compare advanced language models on a new leaderboard