| GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents | Apr 14, 2025 | Vision-Language-Action | Code Available | 3 | 5 |
| A Comprehensive Survey on Continual Learning in Generative Models | Jun 16, 2025 | Continual Learning, Survey | Code Available | 2 | 5 |
| CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games | Mar 12, 2025 | Decision Making, Vision-Language-Action | Code Available | 2 | 5 |
| Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics | Nov 18, 2024 | Vision-Language-Action | Code Available | 2 | 5 |
| Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization | May 21, 2025 | Vision-Language-Action, Zero-shot Generalization | Code Available | 2 | 5 |
| BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation | Jun 9, 2025 | Quantization, Vision-Language-Action | Code Available | 2 | 5 |
| An Embodied Generalist Agent in 3D World | Nov 18, 2023 | 3D dense captioning, 3D Question Answering (3D-QA) | Code Available | 2 | 5 |
| Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy | Mar 25, 2025 | Denoising, Robot Manipulation | Code Available | 2 | 5 |
| Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends | Jun 26, 2025 | Action Generation, Vision-Language-Action | Code Available | 2 | 5 |
| DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Nov 4, 2024 | GPU, Robot Manipulation | Code Available | 2 | 5 |