| Title | Date | Topics | Code | | |
|---|---|---|---|---|---|
| SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Jun 2, 2025 | Action Generation, GPU | Code Available | 11 | 5 |
| OpenVLA: An Open-Source Vision-Language-Action Model | Jun 13, 2024 | Imitation Learning, Language Modelling | Code Available | 9 | 5 |
| UniVLA: Learning to Act Anywhere with Task-centric Latent Actions | May 9, 2025 | Robot Manipulation, Vision-Language-Action | Code Available | 5 | 5 |
| Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success | Feb 27, 2025 | Action Generation, Chunking | Code Available | 5 | 5 |
| ShowUI: One Vision-Language-Action Model for GUI Visual Agent | Nov 26, 2024 | Instruction Following, Natural Language Visual Grounding | Code Available | 5 | 5 |
| OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model | Mar 30, 2025 | Autonomous Driving, Decision Making | Code Available | 4 | 5 |
| A Survey on Vision-Language-Action Models for Autonomous Driving | Jun 30, 2025 | Autonomous Driving, Autonomous Vehicles | Code Available | 4 | 5 |
| A Survey on Vision-Language-Action Models for Embodied AI | May 23, 2024 | Image Captioning, Instruction Following | Code Available | 4 | 5 |
| PointVLA: Injecting the 3D World into Vision-Language-Action Models | Mar 10, 2025 | Imitation Learning, Spatial Reasoning | Code Available | 4 | 5 |
| WorldVLA: Towards Autoregressive Action World Model | Jun 26, 2025 | Action Generation, Model | Code Available | 4 | 5 |