SOTAVerified

Visual Reasoning

The ability to understand the actions and reasoning associated with visual images.

Papers

Showing 1-10 of 698 papers

| Title | Status | Hype |
| --- | --- | --- |
| LaViPlan : Language-Guided Visual Path Planning with RLVR | | 0 |
| Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning | Code | 0 |
| PyVision: Agentic Vision with Dynamic Tooling | | 0 |
| MagiC: Evaluating Multimodal Cognition Toward Grounded Visual Reasoning | | 0 |
| Orchestrator-Agent Trust: A Modular Agentic AI Visual Classification System with Trust-Aware Orchestration and RAG-Based Reasoning | Code | 0 |
| High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning | Code | 2 |
| Skywork-R1V3 Technical Report | Code | 7 |
| Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning | | 0 |
| Foundation Models for Zero-Shot Segmentation of Scientific Images without AI-Ready Data | | 0 |
| MiCo: Multi-image Contrast for Reinforcement Visual Reasoning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Gemini-2.0 + CA | 2-Class Accuracy | 93.6 | | Unverified |
| 2 | GPT-4o + CA | 2-Class Accuracy | 92.8 | | Unverified |
| 3 | Human | 2-Class Accuracy | 91 | | Unverified |
| 4 | SNAIL | 2-Class Accuracy | 64 | | Unverified |
| 5 | InstructBLIP + GPT-4 | 2-Class Accuracy | 63.8 | | Unverified |
| 6 | BLIP-2 + ChatGPT (Fine-tuned) | 2-Class Accuracy | 63.3 | | Unverified |
| 7 | InstructBLIP + ChatGPT + Neuro-Symbolic | 2-Class Accuracy | 55.5 | | Unverified |
| 8 | ChatCaptioner + ChatGPT | 2-Class Accuracy | 49.3 | | Unverified |
| 9 | Otter | 2-Class Accuracy | 49.3 | | Unverified |