| Deep Neural Networks for Visual Reasoning | Sep 24, 2022 | Multimodal Reasoning, Visual Reasoning | —Unverified | 0 | 0 |
| Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices | Jan 28, 2022 | Visual Reasoning | —Unverified | 0 | 0 |
| Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning | May 21, 2025 | Reinforcement Learning (RL), Visual Reasoning | —Unverified | 0 | 0 |
| Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models | Mar 28, 2024 | Instruction Following, Visual Reasoning | —Unverified | 0 | 0 |
| Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning | May 26, 2025 | document understanding, Multimodal Reasoning | —Unverified | 0 | 0 |
| Poisoned-MRAG: Knowledge Poisoning Attacks to Multimodal Retrieval Augmented Generation | Mar 8, 2025 | RAG, Retrieval | —Unverified | 0 | 0 |
| Multimodal Analysis Of Google Bard And GPT-Vision: Experiments In Visual Reasoning | Aug 17, 2023 | Common Sense Reasoning, Optical Character Recognition | —Unverified | 0 | 0 |
| Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities | Jun 20, 2024 | Spatial Reasoning, Visual Reasoning | —Unverified | 0 | 0 |
| Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training | Jun 25, 2021 | Image-text Retrieval, Question Answering | —Unverified | 0 | 0 |
| Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training | May 21, 2021 | Question Answering, Relation | —Unverified | 0 | 0 |
| Probing Visual Language Priors in VLMs | Dec 31, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Procedural Reasoning Networks for Understanding Multimodal Procedures | Sep 19, 2019 | Inductive Bias, Visual Reasoning | —Unverified | 0 | 0 |
| Visual Reasoning of Feature Attribution with Deep Recurrent Neural Networks | Jan 17, 2019 | Classification, General Classification | —Unverified | 0 | 0 |
| Zero-shot visual reasoning through probabilistic analogical mapping | Sep 29, 2022 | Visual Reasoning | —Unverified | 0 | 0 |
| Visual Reasoning with Natural Language | Oct 2, 2017 | Descriptive, Diversity | —Unverified | 0 | 0 |
| Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention | May 5, 2021 | Question Answering, Referring Expression | —Unverified | 0 | 0 |
| PropTest: Automatic Property Testing for Improved Visual Programming | Mar 25, 2024 | Question Answering, Referring Expression | —Unverified | 0 | 0 |
| ProReason: Multi-Modal Proactive Reasoning with Decoupled Eyesight and Wisdom | Oct 18, 2024 | Visual Reasoning | —Unverified | 0 | 0 |
| DAReN: A Collaborative Approach Towards Reasoning And Disentangling | Sep 27, 2021 | Disentanglement, Inductive Bias | —Unverified | 0 | 0 |
| Curriculum Learning for Compositional Visual Reasoning | Mar 27, 2023 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning | Dec 9, 2021 | Diagnostic, Instance Segmentation | —Unverified | 0 | 0 |
| Pyramid Coder: Hierarchical Code Generator for Compositional Visual Question Answering | Jul 30, 2024 | Code Generation, Question Answering | —Unverified | 0 | 0 |
| PyVision: Agentic Vision with Dynamic Tooling | Jul 10, 2025 | Visual Reasoning | —Unverified | 0 | 0 |
| Critical Features Tracking on Triangulated Irregular Networks by a Scale-Space Method | Sep 10, 2024 | Visual Reasoning | —Unverified | 0 | 0 |
| A Domain-Independent Agent Architecture for Adaptive Operation in Evolving Open Worlds | Jun 9, 2023 | Minecraft, Visual Reasoning | —Unverified | 0 | 0 |
| Question Guided Modular Routing Networks for Visual Question Answering | Apr 17, 2019 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Co-VQA : Answering by Interactive Sub Question Sequence | Apr 2, 2022 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Co-VQA : Answering by Interactive Sub Question Sequence | Nov 16, 2021 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| RAVEN: A Dataset for Relational and Analogical Visual rEasoNing | Mar 7, 2019 | Object Recognition, Question Answering | —Unverified | 0 | 0 |
| A Divide-Align-Conquer Strategy for Program Synthesis | Jan 8, 2023 | ARC, Inductive logic programming | —Unverified | 0 | 0 |
| RBench-V: A Primary Assessment for Visual Reasoning Models with Multi-modal Outputs | May 22, 2025 | Image Manipulation, Math | —Unverified | 0 | 0 |
| Reason from Context with Self-supervised Learning | Nov 23, 2022 | Object, Object Recognition | —Unverified | 0 | 0 |
| Reasoning Limitations of Multimodal Large Language Models. A case study of Bongard Problems | Nov 2, 2024 | Specificity, Visual Reasoning | —Unverified | 0 | 0 |
| Reasoning over Vision and Language: Exploring the Benefits of Supplemental Knowledge | Jan 15, 2021 | Question Answering, Visual Question Answering (VQA) | —Unverified | 0 | 0 |
| Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension | Mar 1, 2020 | Referring Expression, Referring Expression Comprehension | —Unverified | 0 | 0 |
| Recurrent Vision Transformer for Solving Visual Reasoning Problems | Nov 29, 2021 | Object Detection, Visual Reasoning | —Unverified | 0 | 0 |
| Continual learning on 3D point clouds with random compressed rehearsal | May 16, 2022 | Continual Learning, Visual Reasoning | —Unverified | 0 | 0 |
| Compositional Law Parsing with Latent Random Functions | Sep 15, 2022 | Position, Visual Reasoning | —Unverified | 0 | 0 |
| Comparison Visual Instruction Tuning | Jun 13, 2024 | Instruction Following, Novelty Detection | —Unverified | 0 | 0 |
| Comparing Visual Reasoning in Humans and AI | Apr 29, 2021 | Sentence, Visual Reasoning | —Unverified | 0 | 0 |
| Replace-then-Perturb: Targeted Adversarial Attacks With Visual Reasoning for Vision-Language Models | Nov 1, 2024 | Adversarial Attack, Contrastive Learning | —Unverified | 0 | 0 |
| Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models | Jan 30, 2025 | Instruction Following, Visual Reasoning | —Unverified | 0 | 0 |
| A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs | Jan 23, 2025 | Descriptive, Diagnostic | —Unverified | 0 | 0 |
| Retrieving and Highlighting Action with Spatiotemporal Reference | May 19, 2020 | Action Recognition, Cross-Modal Retrieval | —Unverified | 0 | 0 |
| Data augmentation by morphological mixup for solving Raven's Progressive Matrices | Mar 9, 2021 | Data Augmentation, Visual Reasoning | —Unverified | 0 | 0 |
| Revisiting MLLMs: An In-Depth Analysis of Image Classification Abilities | Dec 21, 2024 | Attribute, Classification | —Unverified | 0 | 0 |
| Code Repair with LLMs gives an Exploration-Exploitation Tradeoff | May 26, 2024 | Code Repair, Language Modeling | —Unverified | 0 | 0 |
| RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models | Mar 25, 2025 | Image Comprehension, Visual Reasoning | —Unverified | 0 | 0 |
| Robust Visual Reasoning via Language Guided Neural Module Networks | Dec 1, 2021 | Question Answering, Referring Expression | —Unverified | 0 | 0 |
| CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs | Jan 5, 2024 | Image Comprehension, Image to text | —Unverified | 0 | 0 |