Visual Question Answering

MLLM Leaderboard

Papers

Showing 301–310 of 2177 papers

Title	Date	Tasks	Status	Hype
LIVE: Learnable In-Context Vector for Visual Question Answering	Jun 19, 2024	In-Context Learning, Question Answering	Code Available	1
MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models	Jun 17, 2024	Benchmarking, Fact Checking	Code Available	1
MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model	Jun 17, 2024	Language Modeling, Language Modelling	Code Available	1
Vision-Language Models Meet Meteorology: Developing Models for Extreme Weather Events Detection with Heatmaps	Jun 14, 2024	Question Answering, Visual Question Answering	Code Available	1
VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs	Jun 14, 2024	Anomaly Detection, Benchmarking	Code Available	1
Advancing High Resolution Vision-Language Models in Biomedicine	Jun 12, 2024	Language Modeling, Language Modelling	Code Available	1
VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text	Jun 10, 2024	Language Modeling, Language Modelling	Code Available	1
Re-ReST: Reflection-Reinforced Self-Training for Language Agents	Jun 3, 2024	Code Generation, Image Generation	Code Available	1
Instruction-Guided Visual Masking	May 30, 2024	Instruction Following, Visual Grounding	Code Available	1
Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA	May 30, 2024	Diagnostic, Medical Diagnosis	Code Available	1

Datasets: MM-Vet, ViP-Bench, VQA v2 test-dev, BenchLMM, MMBench, V*bench, VQA v2 val, MSRVTT-QA, VQA v2 test-std, MMHal-Bench, MSVD-QA, PlotQA-D1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	MMCTAgent (GPT-4 + GPT-4V)	GPT-4 score	74.24	—	Unverified
2	Qwen2-VL-72B	GPT-4 score	74	—	Unverified
3	InternVL2.5-78B	GPT-4 score	72.3	—	Unverified
4	GPT-4o +text rationale +IoT	GPT-4 score	72.2	—	Unverified
5	Lyra-Pro	GPT-4 score	71.4	—	Unverified
6	GLM-4V-Plus	GPT-4 score	71.1	—	Unverified
7	Phantom-7B	GPT-4 score	70.8	—	Unverified
8	InternVL2.5-38B	GPT-4 score	68.8	—	Unverified
9	InternVL2-26B (SGP, token ratio 64%)	GPT-4 score	65.6	—	Unverified
10	Baichuan-Omni (7B)	GPT-4 score	65.4	—	Unverified