Multiple-choice

Papers

Showing 141–150 of 1107 papers

Title	Date	Tasks	Status	Hype
None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering	Mar 3, 2025	Business Ethics, Ethics	Unverified	0
MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts	Feb 28, 2025	Math, Mathematical Reasoning	Unverified	0
BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology	Feb 28, 2025	Multiple-choice, scientific discovery	Code Available	2
EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants	Feb 27, 2025	Multiple-choice	Code Available	0
Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning	Feb 27, 2025	Math, Medical Question Answering	Unverified	0
ANPMI: Assessing the True Comprehension Capabilities of LLMs for Multiple Choice Questions	Feb 26, 2025	Language Modeling, Language Modelling	Unverified	0
WiCkeD: A Simple Method to Make Multiple Choice Benchmarks More Challenging	Feb 25, 2025	MMLU, Multiple-choice	Code Available	0
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models	Feb 25, 2025	Continual Learning, GSM8K	Unverified	0
DeepSeek-R1 Outperforms Gemini 2.0 Pro, OpenAI o1, and o3-mini in Bilingual Complex Ophthalmology Reasoning	Feb 25, 2025	Management, Multiple-choice	Unverified	0
Reversal Blessing: Thinking Backward May Outpace Thinking Forward in Multi-choice Questions	Feb 25, 2025	Inductive Bias, Logical Reasoning	Unverified	0

No leaderboard results yet.