Group-in-Group Policy Optimization for LLM Agent Training Paper • 2505.10978 • Published May 16, 2025 • 18
Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning Paper • 2505.14684 • Published May 20, 2025 • 24
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation Paper • 2506.03139 • Published Jun 3, 2025 • 17
ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models Paper • 2505.21500 • Published May 27, 2025 • 13
OmniEAR: Benchmarking Agent Reasoning in Embodied Tasks Paper • 2508.05614 • Published Aug 7, 2025 • 20
GUI-G^2: Gaussian Reward Modeling for GUI Grounding Paper • 2507.15846 • Published Jul 21, 2025 • 133