SOTAVerified

Multimodal Reasoning

Reasoning over multimodal inputs.

Papers

Showing 51–60 of 302 papers

| Title | Status | Hype |
|---|---|---|
| Distill Visual Chart Reasoning Ability from LLMs to MLLMs | Code | 2 |
| HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context | Code | 2 |
| Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | Code | 2 |
| Multimodal Analogical Reasoning over Knowledge Graphs | Code | 2 |
| Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings | Code | 1 |
| How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game | Code | 1 |
| Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start | Code | 1 |
| ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark | Code | 1 |
| HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning | Code | 1 |
| MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale | Code | 1 |

Benchmark Results

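| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4V | Accuracy | 24 | | Unverified |
| 2 | Gemini Pro | Accuracy | 13.2 | | Unverified |
| 3 | LLaVa-1.5-13B | Accuracy | 1.8 | | Unverified |
| 4 | LLaVa-1.5-7B | Accuracy | 1.5 | | Unverified |
| 5 | BLIP2-FLAN-T5-XXL | Accuracy | 0.9 | | Unverified |
| 6 | QWEN | Accuracy | 0.9 | | Unverified |
| 7 | CogVLM | Accuracy | 0.9 | | Unverified |
| 8 | InstructBLIP | Accuracy | 0.6 | | Unverified |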
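
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT4V | Accuracy | 22.76 | | Unverified |
| 2 | Gemini Pro | Accuracy | 17.66 | | Unverified |
| 3 | Qwen-VL-Max | Accuracy | 15.59 | | Unverified |
| 4 | InternLM-XComposer2-VL | Accuracy | 14.54 | | Unverified |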

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 | Acc | 30.3 | | Unverified |