SOTAVerified

Multiple-choice

Papers

Showing 101–110 of 1107 papers

Title	Date	Tasks	Status	Hype
InstructionBench: An Instructional Video Understanding Benchmark	Apr 7, 2025	Common Sense Reasoning, Multiple-choice	Unverified	0
Can AI Master Construction Management (CM)? Benchmarking State-of-the-Art Large Language Models on CM Certification Exams	Apr 4, 2025	Benchmarking, Management	Unverified	0
From ChatGPT to DeepSeek AI: A Comprehensive Analysis of Evolution, Deviation, and Future Implications in AI-Language Models	Apr 4, 2025	Multiple-choice	Unverified	0
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence	Apr 3, 2025	Multiple-choice	Code Available	0
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1	Mar 31, 2025	Logical Reasoning, Multiple-choice	Code Available	2
ACPBench Hard: Unrestrained Reasoning about Action, Change, and Planning	Mar 31, 2025	Multiple-choice	Unverified	0
Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models	Mar 30, 2025	Knowledge Graphs, Multiple-choice	Code Available	0
Order Independence With Finetuning	Mar 30, 2025	ARC, Language Modeling	Unverified	0
Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark	Mar 26, 2025	MMLU, Multiple-choice	Code Available	1
Language Model Uncertainty Quantification with Attention Chain	Mar 24, 2025	Computational Efficiency, Language Modeling	Code Available	1


No leaderboard results yet.