SOTAVerified

Spatial Reasoning

Papers

Showing 101-125 of 453 papers

Title | Status | Hype
StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts | Code | 1
HSPFormer: Hierarchical Spatial Perception Transformer for Semantic Segmentation | Code | 1
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery | Code | 1
Learning Action and Reasoning-Centric Image Editing from Videos and Simulations | Code | 1
SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning | Code | 1
CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space | Code | 1
Geospatial Mechanistic Interpretability of Large Language Models | Code | 1
SmartPlay: A Benchmark for LLMs as Intelligent Agents | Code | 1
Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding | Code | 1
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities | Code | 1
From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation | Code | 1
SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting | Code | 1
Capturing Shape Information with Multi-Scale Topological Loss Terms for 3D Reconstruction | Code | 1
MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents | Code | 1
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues | Code | 1
Multi-scale GCN-assisted two-stage network for joint segmentation of retinal layers and disc in peripapillary OCT images | Code | 1
SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Code | 1
Are Deep Neural Networks SMARTer than Second Graders? | Code | 1
SBEVNet: End-to-End Deep Stereo Layout Estimation | Code | 1
Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications | Code | 1
ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting | Code | 1
Seeing is Not Reasoning: MVPBench for Graph-based Evaluation of Multi-path Visual Physical CoT | Code | 1
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Code | 1
OpenKD: Opening Prompt Diversity for Zero- and Few-shot Keypoint Detection | Code | 1
Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning | Code | 1

No leaderboard results yet.