| SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning | Jun 1, 2021 | Question Answering, Spatial Reasoning | Code Available | 1 | 5 |
| Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments | Nov 29, 2018 | Position, Spatial Reasoning | Code Available | 1 | 5 |
| Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications | Feb 5, 2025 | In-Context Learning, Language Modeling | Code Available | 1 | 5 |
| VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software | May 30, 2025 | Question Answering, Spatial Reasoning | Code Available | 1 | 5 |
| Bridging the Dynamic Perception Gap: Training-Free Draft Chain-of-Thought for Dynamic Multimodal Spatial Reasoning | May 22, 2025 | Spatial Reasoning | Code Available | 0 | 5 |
| Representation Learning for Grounded Spatial Reasoning | Jul 13, 2017 | Reinforcement Learning | Code Available | 0 | 5 |
| Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay | Jul 12, 2024 | Spatial Reasoning | Code Available | 0 | 5 |
| EgoHumans: An Egocentric 3D Multi-Human Benchmark | May 25, 2023 | 3D Pose Estimation, Human Detection | Code Available | 0 | 5 |
| OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding | Jul 10, 2025 | Scene Understanding, Spatial Reasoning | Code Available | 0 | 5 |
| Polymath: A Challenging Multi-modal Mathematical Reasoning Benchmark | Oct 6, 2024 | Mathematical Reasoning, Spatial Reasoning | Code Available | 0 | 5 |