SOTAVerified

Spatial Reasoning

Papers

Showing 376–400 of 453 papers

| Title | Status | Hype |
| --- | --- | --- |
| VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search | | 0 |
| ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers | | 0 |
| VLM Can Be a Good Assistant: Enhancing Embodied Visual Tracking with Self-Improving Vision-Language Models | | 0 |
| VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought | | 0 |
| VL-Nav: Real-time Vision-Language Navigation with Spatial Reasoning | | 0 |
| What is needed for simple spatial language capabilities in VQA? | | 0 |
| Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction | | 0 |
| Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities | | 0 |
| WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences | | 0 |
| Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model | | 0 |
| SEM: Enhancing Spatial Understanding for Robust Robot Manipulation | | 0 |
| ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models | | 0 |
| SIRI-Bench: Challenging VLMs' Spatial Intelligence through Complex Reasoning Tasks | | 0 |
| SITE: towards Spatial Intelligence Thorough Evaluation | | 0 |
| Situational Grounding within Multimodal Simulations | | 0 |
| SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs | | 0 |
| Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning | | 0 |
| Representation Learning for Grounded Spatial Reasoning | Code | 0 |
| Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning | Code | 0 |
| In-the-wild Audio Spatialization with Flexible Text-guided Localization | Code | 0 |
| VisFactor: Benchmarking Fundamental Visual Cognition in Multimodal Large Language Models | Code | 0 |
| Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay | Code | 0 |
| Inherent limitations of LLMs regarding spatial information | Code | 0 |
| A Trajectory Calculus for Qualitative Spatial Reasoning Using Answer Set Programming | Code | 0 |
| ImplicitQA: Going beyond frames towards Implicit Video Reasoning | Code | 0 |

No leaderboard results yet.