MatSpray: Fusing 2D Material World Knowledge on 3D Geometry Paper • 2512.18314 • Published 13 days ago • 8
GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation Paper • 2512.17495 • Published 14 days ago • 19
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1, 2025 • 248
VisPlay: Self-Evolving Vision-Language Models from Images Paper • 2511.15661 • Published Nov 19, 2025 • 42
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published Oct 13, 2025 • 100
We’re open-sourcing our text-to-image model and the process behind it Article • Nov 12, 2025 • 76
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published Nov 9, 2025 • 131
The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Article • Nov 3, 2025 • 53
π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models Paper • 2510.25889 • Published Oct 29, 2025 • 64
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27, 2025 • 177
LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models Paper • 2504.14032 • Published Apr 18, 2025 • 7
Baichuan-M2: Scaling Medical Capability with Large Verifier System Paper • 2509.02208 • Published Sep 2, 2025 • 42
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22, 2025 • 114
ARGenSeg: Image Segmentation with Autoregressive Image Generation Model Paper • 2510.20803 • Published Oct 23, 2025 • 9
Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs Paper • 2510.13795 • Published Oct 15, 2025 • 57
AI for Service: Proactive Assistance with AI Glasses Paper • 2510.14359 • Published Oct 16, 2025 • 74