SOTAVerified

Multiple-choice

Papers

Showing 111–120 of 1107 papers

Title | Status | Hype
Fine-tuning Multimodal Large Language Models for Product Bundling | Code | 1
ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization | Code | 1
E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models | Code | 1
EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain | Code | 1
EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding | Code | 1
Do Large Language Models Understand Conversational Implicature -- A case study with a chinese sitcom | Code | 1
IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages | Code | 1
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding | Code | 1
Delving into the Reversal Curse: How Far Can Large Language Models Generalize? | Code | 1
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework | Code | 1

No leaderboard results yet.