| Visual Reasoning Evaluation of Grok, Deepseek Janus, Gemini, Qwen, Mistral, and ChatGPT | Feb 23, 2025 | Bias Detection, Visual Reasoning | Unverified | 0 |
| Visual Reasoning of Feature Attribution with Deep Recurrent Neural Networks | Jan 17, 2019 | Classification, General Classification | Unverified | 0 |
| Visual Reasoning with Natural Language | Oct 2, 2017 | Descriptive, Diversity | Unverified | 0 |
| Visual Structures Helps Visual Reasoning: Addressing the Binding Problem in VLMs | Jun 27, 2025 | Visual Reasoning | Unverified | 0 |
| VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection | May 26, 2025 | Diversity, reinforcement-learning | Unverified | 0 |
| VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models | Apr 21, 2025 | Attribute, Visual Reasoning | Unverified | 0 |
| ViUniT: Visual Unit Tests for More Robust Visual Programming | Dec 12, 2024 | Image Generation, Image-text matching | Unverified | 0 |
| VL-BEiT: Generative Vision-Language Pretraining | Jun 2, 2022 | image-classification, Image Classification | Unverified | 0 |
| VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making | May 6, 2025 | Decision Making, General Knowledge | Unverified | 0 |
| VLM@school -- Evaluation of AI image understanding on German middle school knowledge | Jun 13, 2025 | Visual Reasoning | Unverified | 0 |
| V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices | Jul 29, 2019 | Visual Reasoning | Unverified | 0 |
| VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges | Dec 26, 2022 | Representation Learning, Visual Question Answering (VQA) | Unverified | 0 |
| Weakly Supervised Semantic Parsing with Abstract Examples | Jul 1, 2018 | Semantic Parsing, Visual Reasoning | Unverified | 0 |
| Webly Supervised Knowledge Embedding Model for Visual Reasoning | Jun 1, 2020 | model, Representation Learning | Unverified | 0 |
| What Makes a Maze Look Like a Maze? | Sep 12, 2024 | Visual Reasoning | Unverified | 0 |
| Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities | Jun 20, 2024 | Spatial Reasoning, Visual Reasoning | Unverified | 0 |
| World-aware Planning Narratives Enhance Large Vision-Language Model Planner | Jun 26, 2025 | Imitation Learning, Language Modeling | Unverified | 0 |
| Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry | Apr 9, 2024 | Automated Theorem Proving, CPU | Unverified | 0 |
| X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs | Jul 18, 2024 | Contrastive Learning, Representation Learning | Unverified | 0 |
| ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models | Feb 13, 2025 | Visual Reasoning | Unverified | 0 |
| Zero-Shot Visual Reasoning by Vision-Language Models: Benchmarking and Analysis | Aug 27, 2024 | Benchmarking, Large Language Model | Unverified | 0 |
| ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning | Aug 5, 2024 | Visual Reasoning | Unverified | 0 |
| Deconfounded Visual Grounding | Dec 31, 2021 | Referring Expression, Visual Grounding | Code Available | 0 |
| Visual Reasoning in Object-Centric Deep Neural Networks: A Comparative Cognition Approach | Feb 20, 2024 | Object, Relational Reasoning | Code Available | 0 |
| Learning from Lexical Perturbations for Consistent Visual Question Answering | Nov 26, 2020 | Question Answering, Visual Question Answering | Code Available | 0 |
| UniT: Multimodal Multitask Learning with a Unified Transformer | Feb 22, 2021 | Decoder, Multimodal Reasoning | Code Available | 0 |
| Beyond the Doors of Perception: Vision Transformers Represent Relations Between Objects | Jun 22, 2024 | Relational Reasoning, Visual Reasoning | Code Available | 0 |
| Visual Reasoning with Multi-hop Feature Modulation | Aug 3, 2018 | Question Answering, Visual Dialog | Code Available | 0 |
| Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning | Mar 14, 2018 | Question Answering, Visual Question Answering | Code Available | 0 |
| What Is Missing in Multilingual Visual Reasoning and How to Fix It | Mar 3, 2024 | Image Captioning, Visual Reasoning | Code Available | 0 |
| ControlThinker: Unveiling Latent Semantics for Controllable Image Generation through Visual Reasoning | Jun 4, 2025 | Image Generation, Visual Reasoning | Code Available | 0 |
| Contextual Modulation for Relation-Level Metaphor Identification | Oct 12, 2020 | Relation, Visual Reasoning | Code Available | 0 |
| Complete 3D Scene Parsing from an RGBD Image | Oct 25, 2017 | Diversity, Retrieval | Code Available | 0 |
| Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning | May 28, 2019 | Visual Reasoning | Code Available | 0 |
| What is the Visual Cognition Gap between Humans and Multimodal LLMs? | Jun 14, 2024 | object-detection, Object Detection | Code Available | 0 |
| A Dataset and Architecture for Visual Reasoning with a Working Memory | Mar 16, 2018 | Diagnostic, Logical Reasoning | Code Available | 0 |
| Learning by Abstraction: The Neural State Machine | Jul 9, 2019 | Visual Question Answering (VQA), Visual Reasoning | Code Available | 0 |
| Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge | Jan 1, 2023 | Navigate, Visual Reasoning | Code Available | 0 |
| OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework | Feb 7, 2022 | Image Captioning, image-classification | Code Available | 0 |
| Visual Transformation Telling | May 3, 2023 | Dense Video Captioning, Video Captioning | Code Available | 0 |
| Learning Abstract Visual Reasoning via Task Decomposition: A Case Study in Raven Progressive Matrices | Aug 12, 2023 | Visual Reasoning | Code Available | 0 |
| KnowZRel: Common Sense Knowledge-based Zero-Shot Relationship Retrieval for Generalised Scene Graph Generation | Feb 21, 2025 | Common Sense Reasoning, Graph Generation | Code Available | 0 |
| Unraveling the geometry of visual relational reasoning | Feb 24, 2025 | Relational Reasoning, Relation Network | Code Available | 0 |
| Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task | May 28, 2025 | Visual Reasoning | Code Available | 0 |
| 'Just because you are right, doesn't mean I am wrong': Overcoming a Bottleneck in the Development and Evaluation of Open-Ended Visual Question Answering (VQA) Tasks | Mar 28, 2021 | Question Answering, Visual Question Answering | Code Available | 0 |
| A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap | Jul 31, 2024 | Human-Object Interaction Detection, Image Reconstruction | Code Available | 0 |
| VASR: Visual Analogies of Situation Recognition | Dec 8, 2022 | Common Sense Reasoning, Triplet | Code Available | 0 |
| JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images | Sep 19, 2024 | Hallucination, Image Captioning | Code Available | 0 |
| VDebugger: Harnessing Execution Feedback for Debugging Visual Programs | Jun 19, 2024 | Visual Reasoning | Code Available | 0 |
| Joint Answering and Explanation for Visual Commonsense Reasoning | Feb 25, 2022 | Knowledge Distillation, Question Answering | Code Available | 0 |