SOTAVerified

Visual Question Answering

Papers

Showing 651-675 of 2177 papers

| Title | Status | Hype |
| --- | --- | --- |
| Hypo3D: Exploring Hypothetical Reasoning in 3D | | 0 |
| Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models? | | 0 |
| Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think! | | 0 |
| Boosting Cross-task Transferability of Adversarial Patches with Visual Relations | | 0 |
| Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering | | 0 |
| BOK-VQA: Bilingual outside Knowledge-Based Visual Question Answering via Graph Representation Pretraining | | 0 |
| Document Visual Question Answering Challenge 2020 | | 0 |
| Document Collection Visual Question Answering | | 0 |
| Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models | | 0 |
| Document AI: Benchmarks, Models and Applications | | 0 |
| A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision-Language Models | | 0 |
| Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification | | 0 |
| DLIP: Distilling Language-Image Pre-training | | 0 |
| Generating Question Relevant Captions to Aid Visual Question Answering | | 0 |
| Human Mobility Question Answering (Vision Paper) | | 0 |
| Diversity and Consistency: Exploring Visual Question-Answer Pair Generation | | 0 |
| Diversifying Joint Vision-Language Tokenization Learning | | 0 |
| DistilDoc: Knowledge Distillation for Visually-Rich Document Applications | | 0 |
| Adversarial Attacks Beyond the Image Space | | 0 |
| Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment | | 0 |
| Disentangling Knowledge-based and Visual Reasoning by Question Decomposition in KB-VQA | | 0 |
| Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs | | 0 |
| HRVQA: A Visual Question Answering Benchmark for High-Resolution Aerial Images | | 0 |
| Answer-checking in Context: A Multi-modal FullyAttention Network for Visual Question Answering | | 0 |
| Human-Adversarial Visual Question Answering | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 | | Unverified |
| 2 | Qwen2-VL-72B | GPT-4 score | 74 | | Unverified |
| 3 | InternVL2.5-78B | GPT-4 score | 72.3 | | Unverified |
| 4 | GPT-4o +text rationale +IoT | GPT-4 score | 72.2 | | Unverified |
| 5 | Lyra-Pro | GPT-4 score | 71.4 | | Unverified |
| 6 | GLM-4V-Plus | GPT-4 score | 71.1 | | Unverified |
| 7 | Phantom-7B | GPT-4 score | 70.8 | | Unverified |
| 8 | InternVL2.5-38B | GPT-4 score | 68.8 | | Unverified |
| 9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 | | Unverified |
| 10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 | | Unverified |