SOTAVerified

Visual Question Answering

MLLM Leaderboard

Papers

Showing 2151–2177 of 2177 papers

Title | Status | Hype
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding | Code | 0
Multimodal Residual Learning for Visual QA | Code | 0
Answer-Type Prediction for Visual Question Answering | | 0
Hierarchical Question-Image Co-Attention for Visual Question Answering | Code | 1
End-to-End Instance Segmentation with Recurrent Attention | Code | 0
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering | Code | 0
Leveraging Visual Question Answering for Image-Caption Ranking | | 0
Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering | | 0
Counting Everyday Objects in Everyday Scenes | Code | 0
A Focused Dynamic Attention Model for Visual Question Answering | | 0
Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection | | 0
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge | | 0
Dynamic Memory Networks for Visual and Textual Question Answering | Code | 0
Neural Self Talk: Image Understanding via Continuous Questioning and Answering | | 0
Simple Baseline for Visual Question Answering | Code | 0
A Restricted Visual Turing Test for Deep Scene and Event Understanding | | 0
Where To Look: Focus Regions for Visual Question Answering | | 0
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources | | 0
Compositional Memory for Visual Question Answering | | 0
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering | | 0
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering | Code | 0
Yin and Yang: Balancing and Answering Binary Visual Questions | | 0
Visual7W: Grounded Question Answering in Images | | 0
Explicit Knowledge-based Reasoning for Visual Question Answering | | 0
Neural Module Networks | Code | 0
What value do explicit high level concepts have in vision to language problems? | Code | 0
VQA: Visual Question Answering | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 | | Unverified
2 | Qwen2-VL-72B | GPT-4 score | 74 | | Unverified
3 | InternVL2.5-78B | GPT-4 score | 72.3 | | Unverified
4 | GPT-4o +text rationale +IoT | GPT-4 score | 72.2 | | Unverified
5 | Lyra-Pro | GPT-4 score | 71.4 | | Unverified
6 | GLM-4V-Plus | GPT-4 score | 71.1 | | Unverified
7 | Phantom-7B | GPT-4 score | 70.8 | | Unverified
8 | InternVL2.5-38B | GPT-4 score | 68.8 | | Unverified
9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 | | Unverified
10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 | | Unverified