DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models Paper • 2512.24165 • Published 26 days ago • 50
Robust and Calibrated Detection of Authentic Multimedia Content Paper • 2512.15182 • Published Dec 17, 2025 • 17
Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics Paper • 2512.12602 • Published Dec 14, 2025 • 43
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published Nov 13, 2025 • 97
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation Paper • 2511.09057 • Published Nov 12, 2025 • 79
VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation Paper • 2510.14902 • Published Oct 16, 2025 • 17
Beyond One World: Benchmarking Super Heros in Role-Playing Across Multiversal Contexts Paper • 2510.14351 • Published Oct 16, 2025 • 2
Learning an Image Editing Model without Image Editing Pairs Paper • 2510.14978 • Published Oct 16, 2025 • 9
Attention Is All You Need for KV Cache in Diffusion LLMs Paper • 2510.14973 • Published Oct 16, 2025 • 42
One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning Paper • 2306.07967 • Published Jun 13, 2023 • 25
AHELM: A Holistic Evaluation of Audio-Language Models Paper • 2508.21376 • Published Aug 29, 2025 • 9
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree Paper • 2410.16268 • Published Oct 21, 2024 • 69
GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching Paper • 2506.20480 • Published Jun 25, 2025 • 7
Parallelizing Linear Transformers with the Delta Rule over Sequence Length Paper • 2406.06484 • Published Jun 10, 2024 • 4
Essential-Web v1.0: 24T tokens of organized web data Paper • 2506.14111 • Published Jun 17, 2025 • 46
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads Paper • 2401.10774 • Published Jan 19, 2024 • 59