| Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images using a View-based Representation | Mar 23, 2020 | Decoder, Spatial Reasoning | Code Available | 1 | 5 |
| Capturing Shape Information with Multi-Scale Topological Loss Terms for 3D Reconstruction | Mar 3, 2022 | 3D Reconstruction, Spatial Reasoning | Code Available | 1 | 5 |
| OpenKD: Opening Prompt Diversity for Zero- and Few-shot Keypoint Detection | Sep 30, 2024 | Diversity, Keypoint Detection | Code Available | 1 | 5 |
| NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models | Mar 17, 2025 | Question Answering, Scene Understanding | Code Available | 1 | 5 |
| iVISPAR -- An Interactive Visual-Spatial Reasoning Benchmark for VLMs | Feb 5, 2025 | Spatial Reasoning | Code Available | 1 | 5 |
| Joint Spatio-Textual Reasoning for Answering Tourism Questions | Sep 28, 2020 | Spatial Reasoning | Code Available | 1 | 5 |
| Are Deep Neural Networks SMARTer than Second Graders? | Dec 20, 2022 | Language Modelling, Meta-Learning | Code Available | 1 | 5 |
| Learning Action and Reasoning-Centric Image Editing from Videos and Simulations | Jul 3, 2024 | Attribute, Spatial Reasoning | Code Available | 1 | 5 |
| On The Planning Abilities of OpenAI's o1 Models: Feasibility, Optimality, and Generalizability | Sep 30, 2024 | Decision Making, Management | Code Available | 1 | 5 |
| End-to-End Egospheric Spatial Memory | Feb 15, 2021 | General Reinforcement Learning, Imitation Learning | Code Available | 1 | 5 |