SOTAVerified

Visual Question Answering

MLLM Leaderboard

Papers

Showing 601-610 of 2177 papers

Title | Status | Hype
Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models | Code | 1
A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge | Code | 1
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping | Code | 1
GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection | Code | 1
GRIT: General Robust Image Task Benchmark | Code | 1
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering | Code | 1
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images | Code | 1
Greedy Gradient Ensemble for Robust Visual Question Answering | Code | 1
Dynamic Language Binding in Relational Visual Reasoning | Code | 1
Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MMCTAgent (GPT-4 + GPT-4V) | GPT-4 score | 74.24 | - | Unverified
2 | Qwen2-VL-72B | GPT-4 score | 74 | - | Unverified
3 | InternVL2.5-78B | GPT-4 score | 72.3 | - | Unverified
4 | GPT-4o +text rationale +IoT | GPT-4 score | 72.2 | - | Unverified
5 | Lyra-Pro | GPT-4 score | 71.4 | - | Unverified
6 | GLM-4V-Plus | GPT-4 score | 71.1 | - | Unverified
7 | Phantom-7B | GPT-4 score | 70.8 | - | Unverified
8 | InternVL2.5-38B | GPT-4 score | 68.8 | - | Unverified
9 | InternVL2-26B (SGP, token ratio 64%) | GPT-4 score | 65.6 | - | Unverified
10 | Baichuan-Omni (7B) | GPT-4 score | 65.4 | - | Unverified