| CLIPort: What and Where Pathways for Robotic Manipulation | Sep 24, 2021 | Imitation LearningRobotic Grasping | CodeCode Available | 1 |
| Improved Visual-Spatial Reasoning via R1-Zero-Like Training | Apr 1, 2025 | GPUSpatial Reasoning | CodeCode Available | 1 |
| CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | May 8, 2025 | Large Language ModelNavigate | CodeCode Available | 1 |
| CityGPT: Empowering Urban Spatial Cognition of Large Language Models | Jun 20, 2024 | Code GenerationMath | CodeCode Available | 1 |
| IndoNLI: A Natural Language Inference Dataset for Indonesian | Oct 27, 2021 | Natural Language InferenceSentence | CodeCode Available | 1 |
| iVISPAR -- An Interactive Visual-Spatial Reasoning Benchmark for VLMs | Feb 5, 2025 | Spatial Reasoning | CodeCode Available | 1 |
| Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding | Mar 16, 2025 | Autonomous DrivingRAG | CodeCode Available | 1 |
| Joint Spatio-Textual Reasoning for Answering Tourism Questions | Sep 28, 2020 | Spatial Reasoning | CodeCode Available | 1 |
| CoNav: Collaborative Cross-Modal Reasoning for Embodied Navigation | May 22, 2025 | Scene UnderstandingSpatial Reasoning | CodeCode Available | 1 |
| CityEQA: A Hierarchical LLM Agent on Embodied Question Answering Benchmark in City Space | Feb 18, 2025 | Embodied Question AnsweringQuestion Answering | CodeCode Available | 1 |