Multiple-choice

Papers

Showing 181–190 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models	Jan 29, 2024	Ethics, Multiple-choice	Code Available	1	5
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages	Jun 14, 2024	Multiple-choice	Code Available	1	5
Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs	Mar 12, 2024	Knowledge Graphs, Multiple-choice	Code Available	1	5
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting	May 7, 2023	Multiple-choice	Code Available	1	5
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework	Jul 24, 2023	Contrastive Learning, Multimodal Reasoning	Code Available	1	5
Enhancing Knowledge Tracing with Concept Map and Response Disentanglement	Aug 23, 2024	Disentanglement, Knowledge Tracing	Code Available	1	5
Boosting Healthcare LLMs Through Retrieved Context	Sep 23, 2024	Benchmarking, Multiple-choice	Code Available	1	5
BRAINTEASER: Lateral Thinking Puzzles for Large Language Models	Oct 8, 2023	Distractor Generation, Language Modelling	Code Available	1	5
ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering	Apr 7, 2025	Chart Question Answering, Chart Understanding	Code Available	1	5
IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages	Nov 8, 2020	Genre classification, Multiple-choice	Code Available	1	5

