SOTAVerified

Multiple-choice

Papers

Showing 101-110 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information | Code | 1 |
| CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models | Code | 1 |
| All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages | Code | 1 |
| VidComposition: Can MLLMs Analyze Compositions in Compiled Videos? | Code | 1 |
| MEG: Medical Knowledge-Augmented Large Language Models for Question Answering | Code | 1 |
| MILU: A Multi-task Indic Language Understanding Benchmark | Code | 1 |
| Delving into the Reversal Curse: How Far Can Large Language Models Generalize? | Code | 1 |
| TimeSeriesExam: A time series understanding exam | Code | 1 |
| WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation | Code | 1 |
| MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models | Code | 1 |

No leaderboard results yet.