SOTAVerified

Spatial Reasoning

Papers

Showing 401–450 of 453 papers

Title | Status | Hype
Weakly Supervised Relative Spatial Reasoning for Visual Question Answering | Code | 0
Guided Navigation from Multiple Viewpoints using Qualitative Spatial Reasoning | Code | 0
Grounding Spatial Relations in Text-Only Language Models | Code | 0
Polymath: A Challenging Multi-modal Mathematical Reasoning Benchmark | Code | 0
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding | Code | 0
Grounding Natural Language Instructions: Can Large Language Models Capture Spatial Information? | Code | 0
No Blind Spots: Full-Surround Multi-Object Tracking for Autonomous Vehicles using Cameras & LiDARs | Code | 0
Neuro-symbolic Training for Reasoning over Spatial Language | Code | 0
SORNet: Spatial Object-Centric Representations for Sequential Manipulation | Code | 0
Neural Task Synthesis for Visual Programming | Code | 0
SpaceNLI: Evaluating the Consistency of Predicting Inferences in Space | Code | 0
SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models | Code | 0
SPaRC: A Spatial Pathfinding Reasoning Challenge | Code | 0
Narrowing the Gap between Vision and Action in Navigation | Code | 0
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents | Code | 0
MM-GEN: Enhancing Task Performance Through Targeted Multimodal Data Curation | Code | 0
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models | Code | 0
EgoHumans: An Egocentric 3D Multi-Human Benchmark | Code | 0
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Code | 0
LOViS: Learning Orientation and Visual Signals for Vision and Language Navigation | Code | 0
Disentangling Extraction and Reasoning in Multi-hop Spatial Reasoning | Code | 0
Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity | Code | 0
DepWiGNN: A Depth-wise Graph Neural Network for Multi-hop Spatial Reasoning in Text | Code | 0
Can Large Language Models Reason about the Region Connection Calculus? | Code | 0
Location Aware Modular Biencoder for Tourism Question Answering | Code | 0
Spatial Memory for Context Reasoning in Object Detection | Code | 0
3D CoCa: Contrastive Learners are 3D Captioners | Code | 0
Are LLMs the Master of All Trades? : Exploring Domain-Agnostic Reasoning Skills of LLMs | Code | 0
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs | Code | 0
Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation | Code | 0
From Text to Space: Mapping Abstract Spatial Models in LLMs during a Grid-World Navigation Task | Code | 0
FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks | Code | 0
Translating Place-Related Questions to GeoSPARQL Queries | Code | 0
DeepSSN: a deep convolutional neural network to assess spatial scene similarity | Code | 0
APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents | Code | 0
FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans | Code | 0
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors | Code | 0
Knowing Earlier what Right Means to You: A Comprehensive VQA Dataset for Grounding Relative Directions via Multi-Task Learning | Code | 0
Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning | Code | 0
Explicit Object Relation Alignment for Vision and Language Navigation | Code | 0
SPHERE: A Hierarchical Evaluation on Spatial Perception and Reasoning for Vision-Language Models | Code | 0
SPhyR: Spatial-Physical Reasoning Benchmark on Material Distribution | Code | 0
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules | Code | 0
CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models | Code | 0
Evaluation of Code LLMs on Geospatial Code Generation | Code | 0
STAR-R1: Spacial TrAnsformation Reasoning by Reinforcing Multimodal LLMs | Code | 0
Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data | Code | 0
Investigating Relational State Abstraction in Collaborative MARL | Code | 0
Encoding Spatial Relations from Natural Language | Code | 0
cilantro: A Lean, Versatile, and Efficient Library for Point Cloud Data Processing | Code | 0

No leaderboard results yet.