When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 59
AHELM: A Holistic Evaluation of Audio-Language Models Paper • 2508.21376 • Published Aug 29, 2025 • 9