SOTAVerified

Visual Question Answering


Papers

Showing 761–770 of 2177 papers

| Title | Status | Hype |
|---|---|---|
| Curriculum Learning for Compositional Visual Reasoning | | 0 |
| Curriculum Learning Effectively Improves Low Data VQA | | 0 |
| An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation | | 0 |
| Prompting Medical Large Vision-Language Models to Diagnose Pathologies by Visual Question Answering | | 0 |
| Interactive Visual Task Learning for Robots | | 0 |
| CTRL-O: Language-Controllable Object-Centric Visual Representation Learning | | 0 |
| Barking Up The Syntactic Tree: Enhancing VLM Training with Syntactic Losses | | 0 |
| CT-Agent: A Multimodal-LLM Agent for 3D CT Radiology Question Answering | | 0 |
| CS-VQA: Visual Question Answering with Compressively Sensed Images | | 0 |
| Balancing Performance and Efficiency in Zero-shot Robotic Navigation | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 | | Unverified |
| 2 | Qwen2-VL-72B | GPT-4 score | 74 | | Unverified |
| 3 | InternVL2.5-78B | GPT-4 score | 72.3 | | Unverified |
| 4 | GPT-4o +text rationale +IoT | GPT-4 score | 72.2 | | Unverified |
| 5 | Lyra-Pro | GPT-4 score | 71.4 | | Unverified |
| 6 | GLM-4V-Plus | GPT-4 score | 71.1 | | Unverified |
| 7 | Phantom-7B | GPT-4 score | 70.8 | | Unverified |
| 8 | InternVL2.5-38B | GPT-4 score | 68.8 | | Unverified |
| 9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 | | Unverified |
| 10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 | | Unverified |