SOTAVerified

Multimodal Reasoning

Reasoning over multimodal inputs.

Papers

Showing 131–140 of 302 papers

Title | Status | Hype
Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency |  | 0
KokushiMD-10: Benchmark for Evaluating Large Language Models on Ten Japanese National Healthcare Licensing Examinations |  | 0
Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations |  | 0
Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation |  | 0
MuSciClaims: Multimodal Scientific Claim Verification |  | 0
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning |  | 0
MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos |  | 0
RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought |  | 0
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning |  | 0
GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4V | Accuracy | 24 |  | Unverified
2 | Gemini Pro | Accuracy | 13.2 |  | Unverified
3 | LLaVa-1.5-13B | Accuracy | 1.8 |  | Unverified
4 | LLaVa-1.5-7B | Accuracy | 1.5 |  | Unverified
5 | BLIP2-FLAN-T5-XXL | Accuracy | 0.9 |  | Unverified
6 | QWEN | Accuracy | 0.9 |  | Unverified
7 | CogVLM | Accuracy | 0.9 |  | Unverified
8 | InstructBLIP | Accuracy | 0.6 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT4V | Accuracy | 22.76 |  | Unverified
2 | Gemini Pro | Accuracy | 17.66 |  | Unverified
3 | Qwen-VL-Max | Accuracy | 15.59 |  | Unverified
4 | InternLM-XComposer2-VL | Accuracy | 14.54 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 | Acc | 30.3 |  | Unverified