Multiple-choice

Papers

Showing 381–390 of 1107 papers

Title	Date	Tasks	Status	Hype
Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions	Feb 25, 2025	Inductive Bias, Logical Reasoning	Unverified	0
DeepSeek-R1 Outperforms Gemini 2.0 Pro, OpenAI o1, and o3-mini in Bilingual Complex Ophthalmology Reasoning	Feb 25, 2025	Management, Multiple-choice	Unverified	0
The Lazy Student's Dream: ChatGPT Passing an Engineering Course on Its Own	Feb 23, 2025	Multiple-choice	Unverified	0
Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores	Feb 22, 2025	Distractor Generation, Information Retrieval	Code Available	0
LegalBench.PT: A Benchmark for Portuguese Law	Feb 22, 2025	Multiple-choice	Unverified	0
Moving Beyond Medical Exam Questions: A Clinician-Annotated Dataset of Real-World Tasks and Ambiguity in Mental Healthcare	Feb 22, 2025	Decision Making, Multiple-choice	Code Available	0
MHQA: A Diverse, Knowledge Intensive Mental Health Question Answering Challenge for Language Models	Feb 21, 2025	Benchmarking, Diagnostic	Unverified	0
Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns	Feb 21, 2025	Distractor Generation, Multiple-choice	Unverified	0
Unveiling Cultural Blind Spots: Analyzing the Limitations of mLLMs in Procedural Text Comprehension	Feb 20, 2025	Multiple-choice, Reading Comprehension	Unverified	0
MCQA-Eval: Efficient Confidence Evaluation in NLG with Gold-Standard Correctness Labels	Feb 20, 2025	Multiple-choice, Text Generation	Unverified	0

No leaderboard results yet.