SOTAVerified|Agents Browse Leaderboard About

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 131–140 of 1107 papers

Title	Date	Tasks	Status	Hype
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models	Oct 5, 2023	Common Sense ReasoningMultiple-choice	CodeCode Available	1
Explaining NLP Models via Minimal Contrastive Editing (MiCE)	Dec 27, 2020	counterfactualMultiple-choice	CodeCode Available	1
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework	Jul 24, 2023	Contrastive LearningMultimodal Reasoning	CodeCode Available	1
EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding	Aug 17, 2023	DiagnosticEgoSchema	CodeCode Available	1
Enhancing Knowledge Tracing with Concept Map and Response Disentanglement	Aug 23, 2024	DisentanglementKnowledge Tracing	CodeCode Available	1
EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain	Oct 12, 2022	Distractor GenerationMultiple-choice	CodeCode Available	1
Ranked Voting based Self-Consistency of Large Language Models	May 16, 2025	Multiple-choiceOpen-Ended Question Answering	CodeCode Available	1
E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models	Jan 29, 2024	EthicsMultiple-choice	CodeCode Available	1
Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation	Sep 19, 2023	Language Model EvaluationLanguage Modeling	CodeCode Available	1
An Open Source Data Contamination Report for Large Language Models	Oct 26, 2023	HellaSwagLanguage Modeling	CodeCode Available	1

Show:10 25 50

← PrevPage 14 of 111Next →

No leaderboard results yet.