| Atari-GPT: Benchmarking Multimodal Large Language Models as Low-Level Policies in Atari Games | Aug 28, 2024 | Atari Games, Benchmarking | —Unverified | 0 |
| Improved Algorithms for Allen's Interval Algebra by Dynamic Programming with Sublinear Partitioning | May 25, 2023 | Spatial Reasoning | —Unverified | 0 |
| Space-LLaVA: a Vision-Language Model Adapted to Extraterrestrial Applications | Aug 12, 2024 | Instruction Following, Language Modeling | —Unverified | 0 |
| ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies | Jun 17, 2025 | Scene Generation, Spatial Reasoning | —Unverified | 0 |
| A Light and Smart Wearable Platform with Multimodal Foundation Model for Enhanced Spatial Reasoning in People with Blindness and Low Vision | May 16, 2025 | Large Language Model, Navigate | —Unverified | 0 |
| Learning to encode spatial relations from natural language | May 1, 2019 | Spatial Reasoning | —Unverified | 0 |
| Long Range Arena: A Benchmark for Efficient Transformers | Jan 1, 2021 | 16k, Benchmarking | —Unverified | 0 |
| Controllable Text-to-Image Generation with GPT-4 | May 29, 2023 | Image Generation, Instruction Following | —Unverified | 0 |
| How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM | Apr 8, 2025 | Autonomous Vehicles, Spatial Reasoning | —Unverified | 0 |
| A Symbolic Representation of Human Posture for Interpretable Learning and Reasoning | Oct 17, 2022 | Activity Recognition, Spatial Reasoning | —Unverified | 0 |
| History-Aware Question Answering in a Blocks World Dialogue System | May 26, 2020 | Natural Language Understanding, Question Answering | —Unverified | 0 |
| Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation | Dec 7, 2023 | Spatial Reasoning, Text-to-Video Generation | —Unverified | 0 |
| Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training | Mar 4, 2024 | Math, Phrase Grounding | —Unverified | 0 |
| Hyperdimensional Computing with Spiking-Phasor Neurons | Feb 28, 2023 | Spatial Reasoning | —Unverified | 0 |
| I Know About "Up"! Enhancing Spatial Reasoning in Visual Language Models Through 3D Reconstruction | Jul 19, 2024 | 3D Reconstruction, Spatial Reasoning | —Unverified | 0 |
| HAMMR: HierArchical MultiModal React agents for generic VQA | Apr 8, 2024 | Optical Character Recognition (OCR), Question Answering | —Unverified | 0 |
| Contextual Reasoning for Scene Generation (Technical Report) | May 3, 2023 | Scene Generation, Spatial Reasoning | —Unverified | 0 |
| A Survey of Large Language Model-Powered Spatial Intelligence Across Scales: Advances in Embodied Agents, Smart Cities, and Earth Science | Apr 14, 2025 | Language Modeling, Language Modelling | —Unverified | 0 |
| Large Language Models and Mathematical Reasoning Failures | Feb 17, 2025 | Mathematical Reasoning, Physical Intuition | —Unverified | 0 |
| GSR-BENCH: A Benchmark for Grounded Spatial Reasoning Evaluation via Multimodal LLMs | Jun 19, 2024 | Spatial Reasoning, Visual Reasoning | —Unverified | 0 |
| A Survey for Foundation Models in Autonomous Driving | Feb 2, 2024 | 3D Object Detection, Autonomous Driving | —Unverified | 0 |
| Incentivizing Multimodal Reasoning in Large Models for Direct Robot Manipulation | May 19, 2025 | Multimodal Reasoning, Robot Manipulation | —Unverified | 0 |
| A LLM Benchmark based on the Minecraft Builder Dialog Agent Task | Jul 17, 2024 | Math, Minecraft | —Unverified | 0 |
| LanguageRefer: Spatial-Language Model for 3D Visual Grounding | Jul 7, 2021 | 3D visual grounding, Language Modeling | —Unverified | 0 |
| Grounded Reinforcement Learning for Visual Reasoning | May 29, 2025 | reinforcement-learning, Reinforcement Learning | —Unverified | 0 |