| Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery | May 23, 2025 | 3D Reconstruction, Hand Pose Estimation | —Unverified | 0 | 0 |
| Towards Embodied Cognition in Robots via Spatially Grounded Synthetic Worlds | May 20, 2025 | Spatial Reasoning | —Unverified | 0 | 0 |
| Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models | Aug 18, 2023 | Image-text matching, Object Localization | —Unverified | 0 | 0 |
| Towards Navigation by Reasoning over Spatial Configurations | May 14, 2021 | Spatial Reasoning | —Unverified | 0 | 0 |
| Towards Visual Text Grounding of Multimodal Large Language Model | Apr 7, 2025 | Benchmarking, Language Modeling | —Unverified | 0 | 0 |
| U2-BENCH: Benchmarking Large Vision-Language Models on Ultrasound Understanding | May 23, 2025 | Benchmarking, Spatial Reasoning | —Unverified | 0 | 0 |
| UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction | Mar 19, 2025 | Navigate, Spatial Reasoning | —Unverified | 0 | 0 |
| Unifying Map and Landmark Based Representations for Visual Navigation | Dec 21, 2017 | Navigate, Spatial Reasoning | —Unverified | 0 | 0 |
| Unsupervised Representation Learning Facilitates Human-like Spatial Reasoning | Oct 12, 2021 | Representation Learning, Spatial Reasoning | —Unverified | 0 | 0 |
| Video Perception Models for 3D Scene Synthesis | Jun 25, 2025 | 3D Reconstruction, Image Generation | —Unverified | 0 | 0 |