Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy Paper • 2506.09958 • Published Jun 11, 2025
Point, Detect, Count: Multi-Task Medical Image Understanding with Instruction-Tuned Vision-Language Models Paper • 2505.16647 • Published May 22, 2025
SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding Paper • 2505.16630 • Published May 22, 2025