SOTAVerified

Spatial Reasoning

Papers

Showing 251-300 of 453 papers

Title | Status | Hype
Vision-Integrated LLMs for Autonomous Driving Assistance : Human Performance Comparison and Trust Evaluation | | 0
A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) | | 0
Exploring Spatial Language Grounding Through Referring Expressions | | 0
Spatial-RAG: Spatial Retrieval Augmented Generation for Real-World Geospatial Reasoning Questions | | 0
VL-Nav: Real-time Vision-Language Navigation with Spatial Reasoning | | 0
RLS3: RL-Based Synthetic Sample Selection to Enhance Spatial Reasoning in Vision-Language Models for Indoor Autonomous Perception | | 0
3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | | 0
Bridging Visualization and Optimization: Multimodal Large Language Models on Graph-Structured Combinatorial Optimization | | 0
SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning | | 0
Embodied Scene Understanding for Vision Language Models via MetaVQA | | 0
MM-GEN: Enhancing Task Performance Through Targeted Multimodal Data Curation | Code | 0
AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features | | 0
SpatialCLIP: Learning 3D-aware Image Representations from Spatially Discriminative Language | | 0
SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs | | 0
Chain of Semantics Programming in 3D Gaussian Splatting Representation for 3D Vision Grounding | | 0
R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner | | 0
Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Mutimodal Models | | 0
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models | Code | 0
CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs | | 0
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules | Code | 0
Path-of-Thoughts: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models | | 0
Do Multimodal Language Models Really Understand Direction? A Benchmark for Compass Direction Reasoning | | 0
Investigating Relational State Abstraction in Collaborative MARL | Code | 0
Mathematical Definition and Systematization of Puzzle Rules | | 0
SPHERE: A Hierarchical Evaluation on Spatial Perception and Reasoning for Vision-Language Models | Code | 0
A dual contrastive framework | | 0
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning | | 0
VisionArena: 230K Real World User-VLM Conversations with Preference Labels | | 0
3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark | | 0
VideoSAVi: Self-Aligned Video Language Models without Human Supervision | | 0
Can Large Language Models Reason about the Region Connection Calculus? | Code | 0
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents | Code | 0
Dspy-based Neural-Symbolic Pipeline to Enhance Spatial Reasoning in LLMs | | 0
APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents | Code | 0
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics | | 0
TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation | | 0
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games | | 0
Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | | 0
Architect: Generating Vivid and Interactive 3D Scenes with Hierarchical 2D Inpainting | | 0
AI's Spatial Intelligence: Evaluating AI's Understanding of Spatial Transformations in PSVT:R and Augmented Reality | | 0
GPT-4o System Card | | 0
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction | | 0
Geometric Feature Enhanced Knowledge Graph Embedding and Spatial Reasoning | | 0
Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Composite Spatial Reasoning | | 0
Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning | | 0
Testing GPT-4-o1-preview on math and science problems: A follow-up study | | 0
Structured Spatial Reasoning with Open Vocabulary Object Detectors | | 0
Evaluation of Code LLMs on Geospatial Code Generation | Code | 0
Polymath: A Challenging Multi-modal Mathematical Reasoning Benchmark | Code | 0
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models | | 0

No leaderboard results yet.