SOTAVerified

Visual Question Answering

Papers

Showing 1701-1725 of 2177 papers

Title | Status | Hype
Breaking Neural Network Scaling Laws with Modularity |  | 0
Spatial Attention as an Interface for Image Captioning Models |  | 0
Spatial Knowledge Distillation to aid Visual Reasoning |  | 0
SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning |  | 0
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities |  | 0
Advancing Surgical VQA with Scene Graph Knowledge |  | 0
Breaking Down Questions for Outside-Knowledge Visual Question Answering |  | 0
Breaking Down Questions for Outside-Knowledge VQA |  | 0
SplatTalk: 3D VQA with Gaussian Splatting |  | 0
Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images |  | 0
Boosting Cross-task Transferability of Adversarial Patches with Visual Relations |  | 0
Stacked Latent Attention for Multimodal Reasoning |  | 0
Stacking with Auxiliary Features for Visual Question Answering |  | 0
StackOverflowVQA: Stack Overflow Visual Question Answering Dataset |  | 0
Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation |  | 0
BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining |  | 0
Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models |  | 0
Story Generation from Visual Inputs: Techniques, Related Tasks, and Challenges |  | 0
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering |  | 0
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization |  | 0
StructuralLM: Structural Pre-training for Form Understanding |  | 0
Structure Causal Models and LLMs Integration in Medical Visual Question Answering |  | 0
Advancing Multimodal Medical Capabilities of Gemini |  | 0
xGQA: Cross-Lingual Visual Question Answering |  | 0
Structured Two-stream Attention Network for Video Question Answering |  | 0
Page 69 of 88

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 |  | Unverified
2 | Qwen2-VL-72B | GPT-4 score | 74 |  | Unverified
3 | InternVL2.5-78B | GPT-4 score | 72.3 |  | Unverified
4 | GPT-4o +text rationale +IoT | GPT-4 score | 72.2 |  | Unverified
5 | Lyra-Pro | GPT-4 score | 71.4 |  | Unverified
6 | GLM-4V-Plus | GPT-4 score | 71.1 |  | Unverified
7 | Phantom-7B | GPT-4 score | 70.8 |  | Unverified
8 | InternVL2.5-38B | GPT-4 score | 68.8 |  | Unverified
9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 |  | Unverified
10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 |  | Unverified