SOTAVerified|Agents Browse Leaderboard About

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 611–620 of 1107 papers

Title	Date	Tasks	Status	Hype
FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models	Apr 29, 2024	Common Sense ReasoningMultiple-choice	—Unverified	0
Framing QA as Building and Ranking Intersentence Answer Justifications	Jun 1, 2017	Multiple-choiceQuestion Answering	—Unverified	0
From ChatGPT to DeepSeek AI: A Comprehensive Analysis of Evolution, Deviation, and Future Implications in AI-Language Models	Apr 4, 2025	Multiple-choice	—Unverified	0
From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project	Sep 4, 2019	Multiple-choiceQuestion Answering	—Unverified	0
From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT	May 17, 2024	BenchmarkingMultiple-choice	—Unverified	0
SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing Agents	Nov 12, 2024	General KnowledgeHallucination	—Unverified	0
Fundamental Limitations in Defending LLM Finetuning APIs	Feb 20, 2025	Multiple-choice	—Unverified	0
FusionMind -- Improving question and answering with external context fusion	Dec 31, 2023	Knowledge GraphsMultiple-choice	—Unverified	0
GANDALF: a General Character Name Description Dataset for Long Fiction	Nov 1, 2021	Multiple-choiceQuestion Answering	—Unverified	0
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis	Nov 25, 2024	Medical Visual Question AnsweringMultiple-choice	—Unverified	0

Show:10 25 50

← PrevPage 62 of 111Next →

No leaderboard results yet.