SOTAVerified

Visual Question Answering

MLLM Leaderboard

Papers

Showing 31–40 of 2177 papers

| Title | Status | Hype |
| --- | --- | --- |
| Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | Code | 5 |
| MMBench: Is Your Multi-modal Model an All-around Player? | Code | 5 |
| LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Code | 5 |
| Scaling Up Biomedical Vision-Language Models: Fine-Tuning, Instruction Tuning, and Multi-Modal Learning | Code | 4 |
| OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model | Code | 4 |
| A Survey on Vision-Language-Action Models for Embodied AI | Code | 4 |
| OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning | Code | 4 |
| The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Code | 4 |
| TinyLLaVA: A Framework of Small-scale Large Multimodal Models | Code | 4 |
| OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | Code | 4 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 | | Unverified |
| 2 | Qwen2-VL-72B | GPT-4 score | 74 | | Unverified |
| 3 | InternVL2.5-78B | GPT-4 score | 72.3 | | Unverified |
| 4 | GPT-4o + text rationale + IoT | GPT-4 score | 72.2 | | Unverified |
| 5 | Lyra-Pro | GPT-4 score | 71.4 | | Unverified |
| 6 | GLM-4V-Plus | GPT-4 score | 71.1 | | Unverified |
| 7 | Phantom-7B | GPT-4 score | 70.8 | | Unverified |
| 8 | InternVL2.5-38B | GPT-4 score | 68.8 | | Unverified |
| 9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 | | Unverified |
| 10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 | | Unverified |