SOTAVerified

Spatial Reasoning

Papers

Showing 126–150 of 453 papers

Title | Status | Hype
Spatially Aware Multimodal Transformers for TextVQA | Code | 1
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery | Code | 1
SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting | Code | 1
SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings | Code | 1
Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images using a View-based Representation | Code | 1
VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions | Code | 1
SpatialSense: An Adversarially Crowdsourced Benchmark for Spatial Relation Recognition | Code | 1
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments | Code | 1
GuessWhat?! Visual object discovery through multi-modal dialogue | Code | 1
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning | - | 0
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments | - | 0
ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way | - | 0
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning | - | 0
Scaling RL to Long Videos | - | 0
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding | Code | 0
A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding | - | 0
Optimising Language Models for Downstream Tasks: A Post-Training Perspective | - | 0
ImplicitQA: Going beyond frames towards Implicit Video Reasoning | Code | 0
World-aware Planning Narratives Enhance Large Vision-Language Model Planner | - | 0
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models | - | 0
From 2D to 3D Cognition: A Brief Survey of General World Models | - | 0
Video Perception Models for 3D Scene Synthesis | - | 0
ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies | - | 0
SIRI-Bench: Challenging VLMs' Spatial Intelligence through Complex Reasoning Tasks | - | 0
PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning | - | 0

No leaderboard results yet.