SOTAVerified

Spatial Reasoning

Papers

Showing 101–125 of 453 papers

Title | Status | Hype
Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications | Code | 1
Joint Spatio-Textual Reasoning for Answering Tourism Questions | Code | 1
Knot So Simple: A Minimalistic Environment for Spatial Reasoning | Code | 1
SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Code | 1
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | Code | 1
CityGPT: Empowering Urban Spatial Cognition of Large Language Models | Code | 1
Learning Action and Reasoning-Centric Image Editing from Videos and Simulations | Code | 1
SpartQA: A Textual Question Answering Benchmark for Spatial Reasoning | Code | 1
CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space | Code | 1
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities | Code | 1
ING-VP: MLLMs cannot Play Easy Vision-based Games Yet | Code | 1
Improved Visual-Spatial Reasoning via R1-Zero-Like Training | Code | 1
DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions | Code | 1
GuessWhat?! Visual object discovery through multi-modal dialogue | Code | 1
Capturing Shape Information with Multi-Scale Topological Loss Terms for 3D Reconstruction | Code | 1
HSPFormer: Hierarchical Spatial Perception Transformer for Semantic Segmentation | Code | 1
IndoNLI: A Natural Language Inference Dataset for Indonesian | Code | 1
Are Deep Neural Networks SMARTer than Second Graders? | Code | 1
Geospatial Mechanistic Interpretability of Large Language Models | Code | 1
From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation | Code | 1
Grounded Chain-of-Thought for Multimodal Large Language Models | Code | 1
Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations | Code | 1
Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization | Code | 1
VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software | Code | 1
Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning | Code | 1

No leaderboard results yet.