Multiple-choice

Papers

Showing 251–260 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images?	May 18, 2025	Logical Reasoning, Multimodal Reasoning	Code Available	1	5
LibriSQA: A Novel Dataset and Framework for Spoken Question Answering with Large Language Models	Aug 20, 2023	Multiple-choice, Question Answering	Code Available	1	5
WIQA: A dataset for "What if..." reasoning over procedural text	Sep 10, 2019	Multiple-choice	Code Available	1	5
WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation	Oct 16, 2024	Benchmarking, Fairness	Code Available	1	5
LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts	Jul 6, 2024	Logical Reasoning, Mathematical Reasoning	Code Available	1	5
LongHealth: A Question Answering Benchmark with Long Clinical Documents	Jan 25, 2024	Information Retrieval, Multiple-choice	Code Available	1	5
MedQA-CS: Benchmarking Large Language Models Clinical Skills Using an AI-SCE Framework	Oct 2, 2024	Benchmarking, Instruction Following	Code Available	1	5
Limited Ability of LLMs to Simulate Human Psychological Behaviours: a Psychometric Analysis	May 12, 2024	Multiple-choice, Question Answering	Code Available	0	5
A Study on Large Language Models' Limitations in Multiple-Choice Question Answering	Jan 15, 2024	Multiple-choice, Question Answering	Code Available	0	5
LiveQA: A Question Answering Dataset over Sports Live	Oct 1, 2020	Multiple-choice, Question Answering	Code Available	0	5

No leaderboard results yet.