Cetvel: A Unified Benchmark for Evaluating Language Understanding, Generation and Cultural Capacity of LLMs for Turkish Paper • 2508.16431 • Published Aug 22, 2025 • 1
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models Paper • 2311.07022 • Published Nov 13, 2023 • 1