SOTAVerified

Spatial Reasoning

Papers

Showing 301–350 of 453 papers

Title | Status | Hype
CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs |  | 0
Can Large Language Models Create New Knowledge for Spatial Reasoning Tasks? |  | 0
Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind |  | 0
Can LLM be a Good Path Planner based on Prompt Engineering? Mitigating the Hallucination for Path Planning |  | 0
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps |  | 0
CASPER: Cognitive Architecture for Social Perception and Engagement in Robots |  | 0
Chain of Semantics Programming in 3D Gaussian Splatting Representation for 3D Vision Grounding |  | 0
Challenge of Spatial Cognition for Deep Learning |  | 0
Challenges Faced by Large Language Models in Solving Multi-Agent Flocking |  | 0
CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation |  | 0
Cog-GA: A Large Language Models-based Generative Agent for Vision-Language Navigation in Continuous Environments |  | 0
Combining Deep Learning and Qualitative Spatial Reasoning to Learn Complex Structures from Sparse Examples with Noise |  | 0
Commonsense Spatial Reasoning for Visually Intelligent Agents |  | 0
Commonsense Visual Sensemaking for Autonomous Driving: On Generalised Neurosymbolic Online Abduction Integrating Vision and Semantics |  | 0
Complexity Classification in Infinite-Domain Constraint Satisfaction |  | 0
Contextual Reasoning for Scene Generation (Technical Report) |  | 0
Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training |  | 0
Controllable Text-to-Image Generation with GPT-4 |  | 0
DARE: Diverse Visual Question Answering with Robustness Evaluation |  | 0
DataPlatter: Boosting Robotic Manipulation Generalization with Minimal Costly Data |  | 0
Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs |  | 0
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning |  | 0
Distortions in Judged Spatial Relations in Large Language Models |  | 0
DivCon: Divide and Conquer for Progressive Text-to-Image Generation |  | 0
Do Multimodal Language Models Really Understand Direction? A Benchmark for Compass Direction Reasoning |  | 0
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models |  | 0
Navigating Motion Agents in Dynamic and Cluttered Environments through LLM Reasoning |  | 0
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery |  | 0
Ego-Centric Spatial Memory Networks |  | 0
Ego-Humans: An Ego-Centric 3D Multi-Human Benchmark |  | 0
Embodied Chain of Action Reasoning with Multi-Modal Foundation Model for Humanoid Loco-manipulation |  | 0
Embodied Scene Understanding for Vision Language Models via MetaVQA |  | 0
EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks |  | 0
Embodied World Models Emerge from Navigational Task in Open-Ended Environments |  | 0
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments |  | 0
Endowing Embodied Agents with Spatial Reasoning Capabilities for Vision-and-Language Navigation |  | 0
Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning |  | 0
Explicit Object Relation Alignment for Vision and Language Navigation |  | 0
Exploring and Improving the Spatial Reasoning Abilities of Large Language Models |  | 0
Exploring Spatial Language Grounding Through Referring Expressions |  | 0
Exploring The Spatial Reasoning Ability of Neural Models in Human IQ Tests |  | 0
Fine-grained Qualitative Spatial Reasoning about Point Positions |  | 0
First Order Logic with Fuzzy Semantics for Describing and Recognizing Nerves in Medical Images |  | 0
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts |  | 0
FollowEval: A Multi-Dimensional Benchmark for Assessing the Instruction-Following Capability of Large Language Models |  | 0
Following Instructions by Imagining and Reaching Visual Goals |  | 0
Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization |  | 0
FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors |  | 0
From 2D to 3D Cognition: A Brief Survey of General World Models |  | 0
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes |  | 0

No leaderboard results yet.