| SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation | Jul 16, 2025 | Boundary Detection, Pseudo Label | Unverified | 0 |
| Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation | Jul 15, 2025 | 3D Reconstruction, Autonomous Driving | Unverified | 0 |
| PoseLLM: Enhancing Language-Guided Human Pose Estimation with MLP Alignment | Jul 12, 2025 | Large Language Model, Pose Estimation | Code Available | 0 |
| Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data | Jul 9, 2025 | Motion Generation, Zero-shot Generalization | Code Available | 0 |
| Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models | Jul 8, 2025 | Future prediction, Large Language Model | Unverified | 0 |
| Helping CLIP See Both the Forest and the Trees: A Decomposition and Description Approach | Jul 4, 2025 | Attribute, Contrastive Learning | Unverified | 0 |
| RobuSTereo: Robust Zero-Shot Stereo Matching under Adverse Weather | Jul 2, 2025 | Denoising, Depth Estimation | Unverified | 0 |
| TRACED: Transition-aware Regret Approximation with Co-learnability for Environment Design | Jun 24, 2025 | Deep Reinforcement Learning, Zero-shot Generalization | Code Available | 0 |
| VisLanding: Monocular 3D Perception for UAV Safe Landing via Depth-Normal Synergy | Jun 17, 2025 | Decision Making, Semantic Segmentation | Unverified | 0 |
| LeVERB: Humanoid Whole-Body Control with Latent Vision-Language Instruction | Jun 16, 2025 | Instruction Following, Vision-Language-Action | Unverified | 0 |