Multiple-choice

Papers

Showing 1–10 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
MiniCPM-V: A GPT-4V Level MLLM on Your Phone	Aug 3, 2024	Hallucination, Multiple-choice	Code Available	12	5
HealthBench: Evaluating Large Language Models Towards Improved Human Health	May 13, 2025	Instruction Following, Multiple-choice	Code Available	7	5
MMBench: Is Your Multi-modal Model an All-around Player?	Jul 12, 2023	All, Instruction Following	Code Available	5	5
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks	Dec 19, 2024	8k, In-Context Learning	Code Available	5	5
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs	Jun 11, 2024	Multiple-choice, Question Answering	Code Available	5	5
Flamingo: a Visual Language Model for Few-Shot Learning	Apr 29, 2022	Few-Shot Learning, Generative Visual Question Answering	Code Available	4	5
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text	Mar 27, 2024	Articles, Language Modeling	Code Available	4	5
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models	Jan 30, 2023	Generative Visual Question Answering, Image Captioning	Code Available	4	5
I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench	Jan 31, 2024	Benchmarking, Multiple-choice	Code Available	4	5
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens	Apr 4, 2024	Language Modeling, Language Modelling	Code Available	4	5

No leaderboard results yet.