SOTAVerified

Visual Question Answering

MLLM Leaderboard

Papers

Showing 2161-2170 of 2177 papers

Title | Status | Hype
Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection |  | 0
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge |  | 0
Dynamic Memory Networks for Visual and Textual Question Answering | Code | 0
Neural Self Talk: Image Understanding via Continuous Questioning and Answering |  | 0
Simple Baseline for Visual Question Answering | Code | 0
A Restricted Visual Turing Test for Deep Scene and Event Understanding |  | 0
Where To Look: Focus Regions for Visual Question Answering |  | 0
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources |  | 0
Compositional Memory for Visual Question Answering |  | 0
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 |  | Unverified
2 | Qwen2-VL-72B | GPT-4 score | 74 |  | Unverified
3 | InternVL2.5-78B | GPT-4 score | 72.3 |  | Unverified
4 | GPT-4o +text rationale +IoT | GPT-4 score | 72.2 |  | Unverified
5 | Lyra-Pro | GPT-4 score | 71.4 |  | Unverified
6 | GLM-4V-Plus | GPT-4 score | 71.1 |  | Unverified
7 | Phantom-7B | GPT-4 score | 70.8 |  | Unverified
8 | InternVL2.5-38B | GPT-4 score | 68.8 |  | Unverified
9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 |  | Unverified
10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 |  | Unverified