| A Survey for Foundation Models in Autonomous Driving | Feb 2, 2024 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Jan 31, 2024 | Benchmarking, Change Detection | Code Available | 0 |
| SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities | Jan 22, 2024 | Question Answering, Spatial Reasoning | Unverified | 0 |
| StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments | Jan 9, 2024 | Imputation, Reinforcement Learning (RL) | Unverified | 0 |
| Distortions in Judged Spatial Relations in Large Language Models | Jan 8, 2024 | Misconceptions, Spatial Reasoning | Unverified | 0 |
| Location Aware Modular Biencoder for Tourism Question Answering | Jan 4, 2024 | Question Answering, Retrieval | Code Available | 0 |
| LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding | Dec 21, 2023 | Instruction Following, Language Modeling | Unverified | 0 |
| Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation | Dec 7, 2023 | Spatial Reasoning, Text-to-Video Generation | Unverified | 0 |
| Inherent limitations of LLMs regarding spatial information | Dec 5, 2023 | Spatial Reasoning | Code Available | 0 |
| Exploring and Improving the Spatial Reasoning Abilities of Large Language Models | Dec 2, 2023 | Spatial Reasoning | Unverified | 0 |
| FollowEval: A Multi-Dimensional Benchmark for Assessing the Instruction-Following Capability of Large Language Models | Nov 16, 2023 | Instruction Following, Logical Reasoning | Unverified | 0 |
| Disentangling Extraction and Reasoning in Multi-hop Spatial Reasoning | Oct 25, 2023 | Spatial Reasoning | Code Available | 0 |
| DepWiGNN: A Depth-wise Graph Neural Network for Multi-hop Spatial Reasoning in Text | Oct 19, 2023 | Graph Neural Network, Spatial Reasoning | Code Available | 0 |
| Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning | Oct 15, 2023 | Benchmarking, Spatial Reasoning | Unverified | 0 |
| Integrating Symbolic Reasoning into Neural Generative Models for Design Generation | Oct 13, 2023 | Spatial Reasoning | Unverified | 0 |
| SlotGNN: Unsupervised Discovery of Multi-Object Representations and Visual Dynamics | Oct 6, 2023 | Object, Object Discovery | Unverified | 0 |
| An Evaluation of ChatGPT-4's Qualitative Spatial Reasoning Capabilities in RCC-8 | Sep 27, 2023 | Spatial Reasoning | Unverified | 0 |
| Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation | Sep 20, 2023 | 3D Scene Reconstruction, Depth Estimation | Code Available | 0 |
| Multi-camera Bird's Eye View Perception for Autonomous Driving | Sep 16, 2023 | Autonomous Driving, Sensor Fusion | Unverified | 0 |
| STUPD: A Synthetic Dataset for Spatial and Temporal Relation Reasoning | Sep 13, 2023 | Relation, Relationship Detection | Code Available | 0 |
| Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models | Aug 18, 2023 | Image-text matching, Object Localization | Unverified | 0 |
| Object Goal Navigation with Recursive Implicit Maps | Aug 10, 2023 | Navigate, Object | Unverified | 0 |
| Spatial Intelligence of a Self-driving Car and Rule-Based Decision Making | Aug 2, 2023 | Autonomous Driving, Decision Making | Unverified | 0 |
| SpaceNLI: Evaluating the Consistency of Predicting Inferences in Space | Jul 5, 2023 | Natural Language Inference, Negation | Code Available | 0 |
| Controllable Text-to-Image Generation with GPT-4 | May 29, 2023 | Image Generation, Instruction Following | Unverified | 0 |
| Neural Task Synthesis for Visual Programming | May 26, 2023 | Imitation Learning, Spatial Reasoning | Code Available | 0 |
| Improved Algorithms for Allen's Interval Algebra by Dynamic Programming with Sublinear Partitioning | May 25, 2023 | Spatial Reasoning | Unverified | 0 |
| EgoHumans: An Egocentric 3D Multi-Human Benchmark | May 25, 2023 | 3D Pose Estimation, Human Detection | Code Available | 0 |
| From Patches to Objects: Exploiting Spatial Reasoning for Better Visual Representations | May 21, 2023 | Contrastive Learning, Linear evaluation | Unverified | 0 |
| Contextual Reasoning for Scene Generation (Technical Report) | May 3, 2023 | Scene Generation, Spatial Reasoning | Unverified | 0 |
| Dialectical language model evaluation: An initial appraisal of the commonsense spatial reasoning abilities of LLMs | Apr 22, 2023 | Language Model Evaluation, Language Modeling | Unverified | 0 |
| Are LLMs the Master of All Trades? : Exploring Domain-Agnostic Reasoning Skills of LLMs | Mar 22, 2023 | All, Spatial Reasoning | Code Available | 0 |
| Morpho-logic from a Topos Perspective: Application to symbolic AI | Mar 8, 2023 | Spatial Reasoning | Unverified | 0 |
| Hyperdimensional Computing with Spiking-Phasor Neurons | Feb 28, 2023 | Spatial Reasoning | Unverified | 0 |
| A Pilot Evaluation of ChatGPT and DALL-E 2 on Decision Making and Spatial Reasoning | Feb 15, 2023 | Decision Making, Spatial Reasoning | Unverified | 0 |
| Ego-Humans: An Ego-Centric 3D Multi-Human Benchmark | Jan 1, 2023 | 3D Pose Estimation, Human Detection | Unverified | 0 |
| OpenD: A Benchmark for Language-Driven Door and Drawer Opening | Dec 10, 2022 | Spatial Reasoning | Unverified | 0 |
| Location-Aware Self-Supervised Transformers for Semantic Segmentation | Dec 5, 2022 | Contrastive Learning, image-classification | Unverified | 0 |
| Spatial Reasoning for Few-Shot Object Detection | Nov 2, 2022 | Data Augmentation, Few-Shot Object Detection | Unverified | 0 |
| A Symbolic Representation of Human Posture for Interpretable Learning and Reasoning | Oct 17, 2022 | Activity Recognition, Spatial Reasoning | Unverified | 0 |
| LOViS: Learning Orientation and Visual Signals for Vision and Language Navigation | Sep 26, 2022 | Spatial Reasoning, Vision and Language Navigation | Code Available | 0 |
| Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering | Sep 21, 2022 | Image Captioning, Optical Character Recognition (OCR) | Unverified | 0 |
| CASPER: Cognitive Architecture for Social Perception and Engagement in Robots | Sep 1, 2022 | Action Recognition, Navigate | Unverified | 0 |
| Knowing Earlier what Right Means to You: A Comprehensive VQA Dataset for Grounding Relative Directions via Multi-Task Learning | Jul 6, 2022 | Diagnostic, Multi-Task Learning | Code Available | 0 |
| Translating Place-Related Questions to GeoSPARQL Queries | May 6, 2022 | Geographic Question Answering, Question Answering | Code Available | 0 |
| Explicit Object Relation Alignment for Vision and Language Navigation | May 1, 2022 | Object, Relation | Code Available | 0 |
| DeepSSN: a deep convolutional neural network to assess spatial scene similarity | Feb 7, 2022 | Data Augmentation, Information Retrieval | Code Available | 0 |
| ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Nov 16, 2021 | image-classification, Image Classification | Unverified | 0 |
| Explicit Object Relation Alignment for Vision and Language Navigation | Nov 16, 2021 | Instruction Following, Relation | Unverified | 0 |
| Graph Relation Transformer: Incorporating pairwise object features into the Transformer architecture | Nov 11, 2021 | Graph Attention, Question Answering | Unverified | 0 |