SOTAVerified|Agents Browse Leaderboard About

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 331–340 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
HSI: Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models	Feb 9, 2025	Answer GenerationLanguage Modeling	CodeCode Available	0	5
Beyond English-Only Reading Comprehension: Experiments in Zero-Shot Multilingual Transfer for Bulgarian	Aug 5, 2019	Multiple-choicePhilosophy	CodeCode Available	0	5
EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants	Feb 27, 2025	Multiple-choice	CodeCode Available	0	5
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models	Oct 15, 2023	Multiple-choiceTriplet	CodeCode Available	0	5
DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors	May 29, 2025	MMLUMultiple-choice	CodeCode Available	0	5
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation	Apr 9, 2025	Multiple-choice	CodeCode Available	0	5
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning	Nov 13, 2023	Multiple-choice	CodeCode Available	0	5
BERT-based distractor generation for Swedish reading comprehension questions using a small-scale dataset	Aug 9, 2021	Distractor GenerationMultiple-choice	CodeCode Available	0	5
Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models	Sep 19, 2024	EthicsMultiple-choice	CodeCode Available	0	5
DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension	Feb 1, 2019	Dialogue UnderstandingMultiple-choice	CodeCode Available	0	5

Show:10 25 50

← PrevPage 34 of 111Next →

No leaderboard results yet.