| A Light and Smart Wearable Platform with Multimodal Foundation Model for Enhanced Spatial Reasoning in People with Blindness and Low Vision | May 16, 2025 | Large Language Model, Navigate | —Unverified | 0 | 0 |
| Leveraging LLMs for Mission Planning in Precision Agriculture | Jun 11, 2025 | Spatial Reasoning | —Unverified | 0 | 0 |
| 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | Jan 28, 2025 | Instruction Following, Mixture-of-Experts | —Unverified | 0 | 0 |
| 3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark | Dec 10, 2024 | Autonomous Navigation, Spatial Reasoning | —Unverified | 0 | 0 |
| A Call for New Recipes to Enhance Spatial Reasoning in MLLMs | Apr 21, 2025 | Spatial Reasoning | —Unverified | 0 | 0 |
| ActionFlow: Equivariant, Accurate, and Efficient Policies with Spatially Symmetric Flow Matching | Sep 6, 2024 | Action Generation, Spatial Reasoning | —Unverified | 0 | 0 |
| Space-LLaVA: a Vision-Language Model Adapted to Extraterrestrial Applications | Aug 12, 2024 | Instruction Following, Language Modeling | —Unverified | 0 | 0 |
| A dual contrastive framework | Dec 13, 2024 | Contrastive Learning, Decoder | —Unverified | 0 | 0 |
| Advancing Egocentric Video Question Answering with Multimodal Large Language Models | Apr 6, 2025 | Object Recognition, Question Answering | —Unverified | 0 | 0 |
| AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations | Apr 10, 2025 | Spatial Reasoning, Visual Grounding | —Unverified | 0 | 0 |