SOTAVerified

Multiple-choice

Papers

Showing 251–260 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| Unsupervised Commonsense Question Answering with Self-Talk | Code | 1 |
| R2DE: a NLP approach to estimating IRT parameters of newly generated questions | Code | 1 |
| WIQA: A dataset for "What if..." reasoning over procedural text | Code | 1 |
| CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | Code | 1 |
| Generating Distractors for Reading Comprehension Questions from Real Examinations | Code | 1 |
| Constructing Narrative Event Evolutionary Graph for Script Event Prediction | Code | 1 |
| VQA: Visual Question Answering | Code | 1 |
| The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations | | 0 |
| HATS: Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models | | 0 |
| MateInfoUB: A Real-World Benchmark for Testing LLMs in Competitive, Multilingual, and Multimodal Educational Tasks | | 0 |

No leaderboard results yet.