Large-scale Pre-training for Grounded Video Caption Generation Paper • 2503.10781 • Published Mar 13, 2025