SOTAVerified

Spatial Reasoning

Papers

Showing 201–250 of 453 papers

Title | Status | Hype
Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning | Code | 0
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence | | 0
SpatialPrompting: Keyframe-driven Zero-Shot Spatial Reasoning with Off-the-Shelf Multimodal Large Language Models | | 0
Spatial-RAG: Spatial Retrieval Augmented Generation for Real-World Geospatial Reasoning Questions | | 0
Spatial Reasoner: A 3D Inference Pipeline for XR Applications | | 0
SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning | | 0
Spatial Reasoning and Planning for Deep Embodied Agents | | 0
Spatial Reasoning for Few-Shot Object Detection | | 0
Spatial Reasoning from Natural Language Instructions for Robot Manipulation | | 0
Spatial Symmetry Driven Pruning Strategies for Efficient Declarative Spatial Reasoning | | 0
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities | | 0
SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning | | 0
Stacked Latent Attention for Multimodal Reasoning | | 0
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments | | 0
Statistical applications of the 20/60/20 rule in risk management and portfolio optimization | | 0
STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning | | 0
Stride and Translation Invariance in CNNs | | 0
Structured Spatial Reasoning with Open Vocabulary Object Detectors | | 0
ST-VLM: Kinematic Instruction Tuning for Spatio-Temporal Reasoning in Vision-Language Models | | 0
Tag Map: A Text-Based Map for Spatial Reasoning and Navigation with Large Language Models | | 0
Talking about the Moving Image: A Declarative Model for Image Schema Based Embodied Perception Grounding and Language Generation | | 0
Testing GPT-4-o1-preview on math and science problems: A follow-up study | | 0
TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation | | 0
Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering | | 0
Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery | | 0
Towards Embodied Cognition in Robots via Spatially Grounded Synthetic Worlds | | 0
Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models | | 0
Towards Navigation by Reasoning over Spatial Configurations | | 0
Towards Visual Text Grounding of Multimodal Large Language Model | | 0
U2-BENCH: Benchmarking Large Vision-Language Models on Ultrasound Understanding | | 0
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction | | 0
Unifying Map and Landmark Based Representations for Visual Navigation | | 0
Unsupervised Representation Learning Facilitates Human-like Spatial Reasoning | | 0
Video Perception Models for 3D Scene Synthesis | | 0
VideoSAVi: Self-Aligned Video Language Models without Human Supervision | | 0
VisionArena: 230K Real World User-VLM Conversations with Preference Labels | | 0
Vision-Integrated LLMs for Autonomous Driving Assistance : Human Performance Comparison and Trust Evaluation | | 0
Visual Agentic AI for Spatial Reasoning with a Dynamic API | | 0
VisualEchoes: Spatial Image Representation Learning through Echolocation | | 0
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces | | 0
Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | | 0
VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge | | 0
VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search | | 0
ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers | | 0
VLM Can Be a Good Assistant: Enhancing Embodied Visual Tracking with Self-Improving Vision-Language Models | | 0
VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought | | 0
VL-Nav: Real-time Vision-Language Navigation with Spatial Reasoning | | 0
What is needed for simple spatial language capabilities in VQA? | | 0
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction | | 0
Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities | | 0
Page 5 of 10

No leaderboard results yet.