Multiple-choice

Papers

Showing 381–390 of 1107 papers

Title	Date	Tasks	Status	Hype
BiRdQA: A Bilingual Dataset for Question Answering on Tricky Riddles	Sep 23, 2021	Multiple-choice, Question Answering	Unverified	0
Evaluating the Potential of Leading Large Language Models in Reasoning Biology Questions	Nov 5, 2023	Logical Reasoning, Multiple-choice	Unverified	0
From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT	May 17, 2024	Benchmarking, Multiple-choice	Unverified	0
Establishing Task Scaling Laws via Compute-Efficient Model Ladders	Dec 5, 2024	Language Modeling, Language Modelling	Unverified	0
Evaluating Vision-Language and Large Language Models for Automated Student Assessment in Indonesian Classrooms	Jun 5, 2025	Multiple-choice	Unverified	0
Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration	Jun 24, 2024	Diversity, Multiple-choice	Unverified	0
Evaluation of Automatically Generated Pronoun Reference Questions	Sep 1, 2017	Multiple-choice, Reading Comprehension	Unverified	0
Answer Uncertainty and Unanswerability in Multiple-Choice Machine Reading Comprehension	May 1, 2022	Machine Reading Comprehension, Multiple-choice	Unverified	0
Analysis of the Cambridge Multiple-Choice Questions Reading Dataset with a Focus on Candidate Response Distribution	Jun 22, 2023	Multiple-choice	Unverified	0
EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta	Dec 31, 2024	Multiple-choice, Question Answering	Unverified	0

No leaderboard results yet.