SOTAVerified

Visual Question Answering

MLLM Leaderboard

Papers

Showing 1011–1020 of 2177 papers

| Title | Status | Hype |
|---|---|---|
| Dual Recurrent Attention Units for Visual Question Answering | Code | 0 |
| Logical Implications for Visual Question Answering Consistency | Code | 0 |
| Learning to Follow Object-Centric Image Editing Instructions Faithfully | Code | 0 |
| Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs | Code | 0 |
| Effective Approaches to Batch Parallelization for Dynamic Neural Network Architectures | Code | 0 |
| Learning to Reason: End-to-End Module Networks for Visual Question Answering | Code | 0 |
| DVQA: Understanding Data Visualizations via Question Answering | Code | 0 |
| LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery | Code | 0 |
| Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering | Code | 0 |
| A Question-Centric Model for Visual Question Answering in Medical Imaging | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 | | Unverified |
| 2 | Qwen2-VL-72B | GPT-4 score | 74 | | Unverified |
| 3 | InternVL2.5-78B | GPT-4 score | 72.3 | | Unverified |
| 4 | GPT-4o +text rationale +IoT | GPT-4 score | 72.2 | | Unverified |
| 5 | Lyra-Pro | GPT-4 score | 71.4 | | Unverified |
| 6 | GLM-4V-Plus | GPT-4 score | 71.1 | | Unverified |
| 7 | Phantom-7B | GPT-4 score | 70.8 | | Unverified |
| 8 | InternVL2.5-38B | GPT-4 score | 68.8 | | Unverified |
| 9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 | | Unverified |
| 10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 | | Unverified |