| Interactive Post-Training for Vision-Language-Action Models | May 22, 2025 | Vision-Language-Action | —Unverified | 0 |
| DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving | May 22, 2025 | Autonomous Driving, Bench2Drive | —Unverified | 0 |
| Perceptual Quality Assessment for Embodied AI | May 22, 2025 | Image Quality Assessment, Vision-Language-Action | Code Available | 0 |
| Object-Focus Actor for Data-efficient Robot Generalization Dexterous Manipulation | May 21, 2025 | Object, Pose Estimation | —Unverified | 0 |
| EndoVLA: Dual-Phase Vision-Language-Action Model for Autonomous Tracking in Endoscopy | May 21, 2025 | Motion Planning, Vision-Language-Action | —Unverified | 0 |
| FLARE: Robot Learning with Implicit World Modeling | May 21, 2025 | Imitation Learning, Vision-Language-Action | —Unverified | 0 |
| Conditioning Matters: Training Diffusion Policies is Faster Than You Think | May 16, 2025 | Vision-Language-Action | —Unverified | 0 |
| RT-cache: Efficient Robot Trajectory Retrieval System | May 14, 2025 | Retrieval, Vision-Language-Action | —Unverified | 0 |
| Pixel Motion as Universal Representation for Robot Control | May 12, 2025 | Vision-Language-Action | —Unverified | 0 |
| 3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks | May 9, 2025 | Vision-Language-Action | —Unverified | 0 |