MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Model for Embodied Task Planning • Paper • 2512.16909 • Published 7 days ago • 1
ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation • Paper • 2511.01163 • Published Nov 3 • 31
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation • Paper • 2511.11434 • Published Nov 14 • 44
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs • Paper • 2506.10128 • Published Jun 11 • 22
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies • Paper • 2412.10345 • Published Dec 13, 2024 • 2
DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization • Paper • 2310.19668 • Published Oct 30, 2023 • 3
Premier-TACO: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss • Paper • 2402.06187 • Published Feb 9, 2024 • 11
Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies • Paper • 2402.12673 • Published Feb 20, 2024
Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations • Paper • 2307.12062 • Published Jul 22, 2023