| Vision-Integrated LLMs for Autonomous Driving Assistance: Human Performance Comparison and Trust Evaluation | Feb 6, 2025 | Autonomous Driving, Decision Making | Unverified | 0 |
| A Schema-Guided Reason-while-Retrieve framework for Reasoning on Scene Graphs with Large-Language-Models (LLMs) | Feb 5, 2025 | Hallucination, Spatial Reasoning | Unverified | 0 |
| Exploring Spatial Language Grounding Through Referring Expressions | Feb 4, 2025 | Image Captioning, Negation | Unverified | 0 |
| Spatial-RAG: Spatial Retrieval Augmented Generation for Real-World Geospatial Reasoning Questions | Feb 4, 2025 | Question Answering, RAG | Unverified | 0 |
| VL-Nav: Real-time Vision-Language Navigation with Spatial Reasoning | Feb 2, 2025 | Spatial Reasoning, Vision-Language Navigation | Unverified | 0 |
| RLS3: RL-Based Synthetic Sample Selection to Enhance Spatial Reasoning in Vision-Language Models for Indoor Autonomous Perception | Jan 31, 2025 | Reinforcement Learning (RL), Spatial Reasoning | Unverified | 0 |
| 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | Jan 28, 2025 | Instruction Following, Mixture-of-Experts | Unverified | 0 |
| Bridging Visualization and Optimization: Multimodal Large Language Models on Graph-Structured Combinatorial Optimization | Jan 21, 2025 | Combinatorial Optimization, Sequential Decision Making | Unverified | 0 |
| SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning | Jan 17, 2025 | Spatial Reasoning, Task Planning | Unverified | 0 |
| Embodied Scene Understanding for Vision Language Models via MetaVQA | Jan 15, 2025 | Decision Making, Question Answering | Unverified | 0 |
| MM-GEN: Enhancing Task Performance Through Targeted Multimodal Data Curation | Jan 7, 2025 | Spatial Reasoning | Code Available | 0 |
| AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features | Jan 7, 2025 | 3D Object Detection, Computational Efficiency | Unverified | 0 |
| SpatialCLIP: Learning 3D-aware Image Representations from Spatially Discriminative Language | Jan 1, 2025 | Spatial Reasoning | Unverified | 0 |
| SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs | Jan 1, 2025 | Contrastive Learning, Image Generation | Unverified | 0 |
| Chain of Semantics Programming in 3D Gaussian Splatting Representation for 3D Vision Grounding | Jan 1, 2025 | 3DGS, Large Language Model | Unverified | 0 |
| R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner | Jan 1, 2025 | Action Generation, Game of Chess | Unverified | 0 |
| Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models | Jan 1, 2025 | Attribute, Diagnostic | Unverified | 0 |
| MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models | Dec 31, 2024 | Multiple-choice, Question Answering | Code Available | 0 |
| CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs | Dec 27, 2024 | Spatial Reasoning | Unverified | 0 |
| Expand VSR Benchmark for VLLM to Expertize in Spatial Rules | Dec 24, 2024 | MME, Sensitivity | Code Available | 0 |
| Path-of-Thoughts: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models | Dec 23, 2024 | Relational Reasoning, Spatial Reasoning | Unverified | 0 |
| Do Multimodal Language Models Really Understand Direction? A Benchmark for Compass Direction Reasoning | Dec 21, 2024 | Spatial Reasoning | Unverified | 0 |
| Investigating Relational State Abstraction in Collaborative MARL | Dec 19, 2024 | Graph Neural Network, Multi-agent Reinforcement Learning | Code Available | 0 |
| Mathematical Definition and Systematization of Puzzle Rules | Dec 18, 2024 | Game Design, Spatial Reasoning | Unverified | 0 |
| SPHERE: A Hierarchical Evaluation on Spatial Perception and Reasoning for Vision-Language Models | Dec 17, 2024 | Logical Reasoning, Spatial Reasoning | Code Available | 0 |