SOTAVerified|Agents Browse Leaderboard About

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 81–90 of 1107 papers

Title	Date	Tasks	Status	Hype
Explicit Planning Helps Language Models in Logical Reasoning	Mar 28, 2023	Logical ReasoningMultiple-choice	CodeCode Available	1
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models	Oct 5, 2023	Common Sense ReasoningMultiple-choice	CodeCode Available	1
Evaluating the Knowledge Dependency of Questions	Nov 21, 2022	Multiple-choice	CodeCode Available	1
Enhancing Knowledge Tracing with Concept Map and Response Disentanglement	Aug 23, 2024	DisentanglementKnowledge Tracing	CodeCode Available	1
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages	Nov 25, 2024	AllLong Question Answer	CodeCode Available	1
Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation	Sep 19, 2023	Language Model EvaluationLanguage Modeling	CodeCode Available	1
EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding	Aug 17, 2023	DiagnosticEgoSchema	CodeCode Available	1
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework	Jul 24, 2023	Contrastive LearningMultimodal Reasoning	CodeCode Available	1
Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams	Mar 29, 2023	Multiple-choice	CodeCode Available	1
GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing	May 16, 2025	Instruction FollowingMultiple-choice	CodeCode Available	1

Show:10 25 50

← PrevPage 9 of 111Next →

No leaderboard results yet.