| HSPFormer: Hierarchical Spatial Perception Transformer for Semantic Segmentation | Jan 16, 2025 | Depth Estimation, Monocular Depth Estimation | Code Available | 1 |
| VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software | May 30, 2025 | Question Answering, Spatial Reasoning | Code Available | 1 |
| Grounded Chain-of-Thought for Multimodal Large Language Models | Mar 17, 2025 | Hallucination, Spatial Reasoning | Code Available | 1 |
| Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning | Oct 19, 2023 | MuJoCo, Prompt Engineering | Code Available | 1 |
| Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications | Feb 5, 2025 | In-Context Learning, Language Modeling | Code Available | 1 |
| Grounding Consistency: Distilling Spatial Common Sense for Precise Visual Relationship Detection | Jan 1, 2021 | Common Sense Reasoning, Graph Generation | Code Available | 1 |
| Geospatial Mechanistic Interpretability of Large Language Models | May 6, 2025 | Spatial Reasoning | Code Available | 1 |
| Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning | Oct 5, 2023 | Navigate, Spatial Reasoning | Code Available | 1 |
| End-to-End Egospheric Spatial Memory | Feb 15, 2021 | General Reinforcement Learning, Imitation Learning | Code Available | 1 |
| From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation | May 13, 2025 | Robot Manipulation, Spatial Reasoning | Code Available | 1 |