| Title | Date | Tags |
| --- | --- | --- |
| Run-time Observation Interventions Make Vision-Language-Action Models More Visually Robust | Oct 2, 2024 | Vision-Language-Action |
| ReVLA: Reverting Visual Domain Limitation of Robotic Foundation Models | Sep 23, 2024 | Vision-Language-Action |
| Manipulation Facing Threats: Evaluating Physical Vulnerabilities in End-to-End Vision Language Action Models | Sep 20, 2024 | Vision-Language-Action |
| HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers | Sep 12, 2024 | Vision-Language-Action |
| OccLLaMA: An Occupancy-Language-Action Generative World Model for Autonomous Driving | Sep 5, 2024 | Autonomous Driving, Motion Planning |
| CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving | Aug 19, 2024 | Autonomous Driving, Caption Generation |
| Robotic Control via Embodied Chain-of-Thought Reasoning | Jul 11, 2024 | Vision-Language-Action |
| Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs | Jul 10, 2024 | Common Sense Reasoning, Vision-Language-Action |
| OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents | Jun 27, 2024 | Decoder, Imitation Learning |
| Towards Natural Language-Driven Assembly Using Foundation Models | Jun 23, 2024 | Friction, Vision-Language-Action |