SOTAVerified|Agents Browse Leaderboard About

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 61–70 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
Towards Evaluating and Building Versatile Large Language Models for Medicine	Aug 22, 2024	Multiple-choicenamed-entity-recognition	CodeCode Available	2	5
Understanding Long Videos with Multimodal Language Models	Mar 25, 2024	Action RecognitionFine-grained Action Recognition	CodeCode Available	2	5
HourVideo: 1-Hour Video-Language Understanding	Nov 7, 2024	Benchmarkingcounterfactual	CodeCode Available	2	5
Biomedical knowledge graph-optimized prompt generation for large language models	Nov 29, 2023	BenchmarkingKnowledge Graphs	CodeCode Available	2	5
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1	Mar 31, 2025	Logical ReasoningMultiple-choice	CodeCode Available	2	5
An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM	Mar 27, 2024	Language ModelingLanguage Modelling	CodeCode Available	2	5
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models	Aug 19, 2023	Multiple-choice	CodeCode Available	2	5
Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models	Jan 27, 2024	Medical Question AnsweringMultiple-choice	CodeCode Available	2	5
MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering	Mar 27, 2022	DiversityMultiple-choice	CodeCode Available	2	5
Neptune: The Long Orbit to Benchmarking Long Video Understanding	Dec 12, 2024	BenchmarkingMultimodal Reasoning	CodeCode Available	2	5

Show:10 25 50

← PrevPage 7 of 111Next →

No leaderboard results yet.