SOTAVerified

Visual Question Answering

MLLM Leaderboard

Papers

Showing 1251–1275 of 2177 papers

| Title | Status | Hype |
| --- | --- | --- |
| Multimodal Commonsense Knowledge Distillation for Visual Question Answering | | 0 |
| VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework | | 0 |
| Multimodal Compact Bilinear Pooling for Multimodal Neural Machine Translation | | 0 |
| Multimodal Continuous Visual Attention Mechanisms | | 0 |
| Multi-modal Deep Analysis for Multimedia | | 0 |
| Multi-Modal Explainable Medical AI Assistant for Trustworthy Human-AI Collaboration | | 0 |
| Vision-Language Models as Success Detectors | | 0 |
| Vision Language Models Can Parse Floor Plan Maps | | 0 |
| Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think! | | 0 |
| Multimodal Few-Shot Learning with Frozen Language Models | | 0 |
| Document Visual Question Answering Challenge 2020 | | 0 |
| Multi-Modal Fusion Transformer for Visual Question Answering in Remote Sensing | | 0 |
| Multimodal Graph Networks for Compositional Generalization in Visual Question Answering | | 0 |
| Multimodal grid features and cell pointers for Scene Text Visual Question Answering | | 0 |
| Multi-Modal Instruction-Tuning Small-Scale Language-and-Vision Assistant for Semiconductor Electron Micrograph Analysis | | 0 |
| Multimodal Integration of Human-Like Attention in Visual Question Answering | | 0 |
| Multimodal Intelligence: Representation Learning, Information Fusion, and Applications | | 0 |
| Document Collection Visual Question Answering | | 0 |
| Multi-modality Latent Interaction Network for Visual Question Answering | | 0 |
| Document AI: Benchmarks, Models and Applications | | 0 |
| Vision-Language Models for Edge Networks: A Comprehensive Survey | | 0 |
| Multimodal Learning and Reasoning for Visual Question Answering | | 0 |
| Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering | | 0 |
| Multimodal Neural Graph Memory Networks for Visual Question Answering | | 0 |
| DLIP: Distilling Language-Image Pre-training | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 | | Unverified |
| 2 | Qwen2-VL-72B | GPT-4 score | 74 | | Unverified |
| 3 | InternVL2.5-78B | GPT-4 score | 72.3 | | Unverified |
| 4 | GPT-4o +text rationale +IoT | GPT-4 score | 72.2 | | Unverified |
| 5 | Lyra-Pro | GPT-4 score | 71.4 | | Unverified |
| 6 | GLM-4V-Plus | GPT-4 score | 71.1 | | Unverified |
| 7 | Phantom-7B | GPT-4 score | 70.8 | | Unverified |
| 8 | InternVL2.5-38B | GPT-4 score | 68.8 | | Unverified |
| 9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 | | Unverified |
| 10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 | | Unverified |