| A Vision Centric Remote Sensing Benchmark | Mar 20, 2025 | Question Answering, Representation Learning | Unverified | 0 |
| OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence | Mar 20, 2025 | Instruction Following, Natural Language Understanding | Unverified | 0 |
| Sonata: Self-Supervised Learning of Reliable Point Representations | Mar 20, 2025 | 3D Semantic Segmentation, Self-Supervised Learning | Code Available | 4 |
| Statistical applications of the 20/60/20 rule in risk management and portfolio optimization | Mar 19, 2025 | Management, Portfolio Optimization | Unverified | 0 |
| UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction | Mar 19, 2025 | Navigate, Spatial Reasoning | Unverified | 0 |
| CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models | Mar 18, 2025 | Benchmarking, Spatial Reasoning | Code Available | 0 |
| NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models | Mar 17, 2025 | Question Answering, Scene Understanding | Code Available | 1 |
| Free-form language-based robotic reasoning and grasping | Mar 17, 2025 | Form, Robotic Grasping | Code Available | 2 |
| Grounded Chain-of-Thought for Multimodal Large Language Models | Mar 17, 2025 | Hallucination, Spatial Reasoning | Code Available | 1 |
| Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding | Mar 16, 2025 | Autonomous Driving, RAG | Code Available | 1 |