SOTAVerified

Multiple-choice

Papers

Showing 121-130 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| Latxa: An Open Language Model and Evaluation Suite for Basque | Code | 1 |
| Assessing the Chemical Intelligence of Large Language Models | Code | 1 |
| Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework | Code | 1 |
| Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling | Code | 1 |
| LibriSQA: A Novel Dataset and Framework for Spoken Question Answering with Large Language Models | Code | 1 |
| LifeQA: A Real-life Dataset for Video Question Answering | Code | 1 |
| EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain | Code | 1 |
| EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding | Code | 1 |
| Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams | Code | 1 |
| Delving into the Reversal Curse: How Far Can Large Language Models Generalize? | Code | 1 |

No leaderboard results yet.