SOTAVerified|Agents Browse Leaderboard About

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 181–190 of 1107 papers

Title	Date	Tasks	Status	Hype
LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation	Jun 4, 2025	Multiple-choice	CodeCode Available	1
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages	Jun 14, 2024	Multiple-choice	CodeCode Available	1
A Few More Examples May Be Worth Billions of Parameters	Oct 8, 2021	Extractive Question-AnsweringMultiple-choice	CodeCode Available	1
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation	Jan 6, 2025	Language Model EvaluationLanguage Modeling	CodeCode Available	1
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework	Jul 24, 2023	Contrastive LearningMultimodal Reasoning	CodeCode Available	1
Long Horizon Temperature Scaling	Feb 7, 2023	Multiple-choice	CodeCode Available	1
Evaluating language models as risk scores	Jul 19, 2024	Multiple-choiceQuestion Answering	CodeCode Available	1
BRAINTEASER: Lateral Thinking Puzzles for Large Language Models	Oct 8, 2023	Distractor GenerationLanguage Modelling	CodeCode Available	1
EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain	Oct 12, 2022	Distractor GenerationMultiple-choice	CodeCode Available	1
E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models	Jan 29, 2024	EthicsMultiple-choice	CodeCode Available	1

Show:10 25 50

← PrevPage 19 of 111Next →

No leaderboard results yet.