
Multiple-choice

Papers


Showing 211–220 of 1107 papers

Title	Date	Tasks	Status	Hype
AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models	Feb 24, 2025	Logical Reasoning, Multiple-choice	Code Available	1
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework	Jul 24, 2023	Contrastive Learning, Multimodal Reasoning	Code Available	1
Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams	Mar 29, 2023	Multiple-choice	Code Available	1
ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies	Mar 2, 2024	Multiple-choice	Code Available	1
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages	Nov 25, 2024	All, Long Question Answer	Code Available	1
Polishing Every Facet of the GEM: Testing Linguistic Competence of LLMs and Humans in Korean	Jun 2, 2025	Multiple-choice	Code Available	1
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering	Apr 7, 2025	Chart Question Answering, Chart Understanding	Code Available	1
Explicit Planning Helps Language Models in Logical Reasoning	Mar 28, 2023	Logical Reasoning, Multiple-choice	Code Available	1
Do Large Language Models Understand Conversational Implicature -- A case study with a chinese sitcom	Apr 30, 2024	Implicatures, Multiple-choice	Code Available	1
EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain	Oct 12, 2022	Distractor Generation, Multiple-choice	Code Available	1


No leaderboard results yet.