| Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics | Nov 18, 2024 | Vision-Language-Action | Code Available | 2 |
| DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Nov 4, 2024 | GPU, Robot Manipulation | Code Available | 2 |
| Diffusion Transformer Policy | Oct 21, 2024 | Denoising, Vision-Language-Action | Code Available | 2 |
| TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation | Sep 19, 2024 | Vision-Language-Action | Code Available | 2 |
| An Embodied Generalist Agent in 3D World | Nov 18, 2023 | 3D dense captioning, 3D Question Answering (3D-QA) | Code Available | 2 |
| RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control | Jul 28, 2023 | Object, Question Answering | Code Available | 2 |
| VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting | Jul 7, 2025 | Depth Estimation, Vision-Language-Action | Code Available | 1 |
| Adversarial Attacks on Robotic Vision Language Action Models | Jun 3, 2025 | Vision-Language-Action | Code Available | 1 |
| ChatVLA-2: Vision-Language-Action Model with Open-World Embodied Reasoning from Pretrained Knowledge | May 28, 2025 | Imitation Learning, Math | Code Available | 1 |
| RoboFAC: A Comprehensive Framework for Robotic Failure Analysis and Correction | May 18, 2025 | Vision-Language-Action | Code Available | 1 |