LiteAttention: A Temporal Sparse Attention for Diffusion Transformers Paper • 2511.11062 • Published Nov 14, 2025 • 31
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published Nov 12, 2025 • 69
DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion Paper • 2510.20766 • Published Oct 23, 2025 • 34
Advancing Speech Understanding in Speech-Aware Language Models with GRPO Paper • 2509.16990 • Published Sep 21, 2025 • 18
Story2Board: A Training-Free Approach for Expressive Storyboard Generation Paper • 2508.09983 • Published Aug 13, 2025 • 69
Auto-Regressive vs Flow-Matching: a Comparative Study of Modeling Paradigms for Text-to-Music Generation Paper • 2506.08570 • Published Jun 10, 2025 • 33
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion Paper • 2506.08009 • Published Jun 9, 2025 • 30
Scaling Analysis of Interleaved Speech-Text Language Models Paper • 2504.02398 • Published Apr 3, 2025 • 31
OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models Paper • 2503.18033 • Published Mar 23, 2025 • 27
Single Image Iterative Subject-driven Generation and Editing Paper • 2503.16025 • Published Mar 20, 2025 • 14
"Principal Components" Enable A New Language of Images Paper • 2503.08685 • Published Mar 11, 2025 • 12
RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling Paper • 2503.09601 • Published Mar 12, 2025 • 16
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published Feb 19, 2025 • 69
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights Paper • 2502.09619 • Published Feb 13, 2025 • 35
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published Jan 16, 2025 • 71