| Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications | Feb 5, 2025 | In-Context LearningLanguage Modeling | CodeCode Available | 1 |
| Joint Spatio-Textual Reasoning for Answering Tourism Questions | Sep 28, 2020 | Spatial Reasoning | CodeCode Available | 1 |
| Knot So Simple: A Minimalistic Environment for Spatial Reasoning | May 23, 2025 | Model Predictive ControlSpatial Reasoning | CodeCode Available | 1 |
| SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Apr 17, 2025 | Image GenerationLarge Language Model | CodeCode Available | 1 |
| CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | May 8, 2025 | Large Language ModelNavigate | CodeCode Available | 1 |
| CityGPT: Empowering Urban Spatial Cognition of Large Language Models | Jun 20, 2024 | Code GenerationMath | CodeCode Available | 1 |
| Learning Action and Reasoning-Centric Image Editing from Videos and Simulations | Jul 3, 2024 | AttributeSpatial Reasoning | CodeCode Available | 1 |
| SpartQA: : A Textual Question Answering Benchmark for Spatial Reasoning | Apr 12, 2021 | Question AnsweringSpatial Reasoning | CodeCode Available | 1 |
| CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space | Feb 18, 2025 | Embodied Question AnsweringQuestion Answering | CodeCode Available | 1 |
| Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Under Ambiguities | Oct 22, 2024 | Spatial Reasoning | CodeCode Available | 1 |
| ING-VP: MLLMs cannot Play Easy Vision-based Games Yet | Oct 9, 2024 | Spatial Reasoning | CodeCode Available | 1 |
| Improved Visual-Spatial Reasoning via R1-Zero-Like Training | Apr 1, 2025 | GPUSpatial Reasoning | CodeCode Available | 1 |
| DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions | Sep 7, 2023 | PositionSpatial Reasoning | CodeCode Available | 1 |
| GuessWhat?! Visual object discovery through multi-modal dialogue | Nov 23, 2016 | ObjectObject Discovery | CodeCode Available | 1 |
| Capturing Shape Information with Multi-Scale Topological Loss Terms for 3D Reconstruction | Mar 3, 2022 | 3D ReconstructionSpatial Reasoning | CodeCode Available | 1 |
| HSPFormer: Hierarchical Spatial Perception Transformer for Semantic Segmentation | Jan 16, 2025 | Depth EstimationMonocular Depth Estimation | CodeCode Available | 1 |
| IndoNLI: A Natural Language Inference Dataset for Indonesian | Oct 27, 2021 | Natural Language InferenceSentence | CodeCode Available | 1 |
| Are Deep Neural Networks SMARTer than Second Graders? | Dec 20, 2022 | Language ModellingMeta-Learning | CodeCode Available | 1 |
| Geospatial Mechanistic Interpretability of Large Language Models | May 6, 2025 | Spatial Reasoning | CodeCode Available | 1 |
| From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation | May 13, 2025 | Robot ManipulationSpatial Reasoning | CodeCode Available | 1 |
| Grounded Chain-of-Thought for Multimodal Large Language Models | Mar 17, 2025 | HallucinationSpatial Reasoning | CodeCode Available | 1 |
| Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations | Jun 5, 2025 | 4kSpatial Reasoning | CodeCode Available | 1 |
| Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization | Apr 25, 2025 | Spatial Reasoning | CodeCode Available | 1 |
| VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software | May 30, 2025 | Question AnsweringSpatial Reasoning | CodeCode Available | 1 |
| Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning | Oct 5, 2023 | NavigateSpatial Reasoning | CodeCode Available | 1 |