microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 214k • 1.56k
chaoyinshe/llava-med-v1.5-mistral-7b-hf • Visual Question Answering • 8B • Updated Dec 4, 2025 • 1.83k • 5
google/pix2struct-widget-captioning-large • Visual Question Answering • 1B • Updated Apr 10, 2024 • 54 • 20