| CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving | Aug 19, 2024 | Autonomous Driving, Caption Generation | —Unverified | 0 | 0 |
| CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation | Jun 24, 2025 | Chunking, Vision-Language-Action | —Unverified | 0 | 0 |
| DataPlatter: Boosting Robotic Manipulation Generalization with Minimal Costly Data | Mar 25, 2025 | Robot Manipulation, Spatial Reasoning | —Unverified | 0 | 0 |
| DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping | Feb 28, 2025 | Imitation Learning, Vision-Language-Action | —Unverified | 0 | 0 |
| DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models | Jun 6, 2025 | Autonomous Driving, Autonomous Vehicles | —Unverified | 0 | 0 |
| DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving | May 22, 2025 | Autonomous Driving, Bench2Drive | —Unverified | 0 | 0 |
| EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models | Jun 11, 2025 | Vision-Language-Action | —Unverified | 0 | 0 |
| Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review | May 26, 2025 | Decision Making Under Uncertainty, Sensor Fusion | —Unverified | 0 | 0 |
| EndoVLA: Dual-Phase Vision-Language-Action Model for Autonomous Tracking in Endoscopy | May 21, 2025 | Motion Planning, Vision-Language-Action | —Unverified | 0 | 0 |
| Evolution 6.0: Evolving Robotic Capabilities Through Generative Design | Feb 24, 2025 | Action Generation, Text to 3D | —Unverified | 0 | 0 |
| FAST: Efficient Action Tokenization for Vision-Language-Action Models | Jan 16, 2025 | Vision-Language-Action | —Unverified | 0 | 0 |
| FLARE: Robot Learning with Implicit World Modeling | May 21, 2025 | Imitation Learning, Vision-Language-Action | —Unverified | 0 | 0 |
| ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation | May 28, 2025 | Contact-rich Manipulation, Mixture-of-Experts | —Unverified | 0 | 0 |
| From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models | Jun 11, 2025 | Imitation Learning, Vision-Language-Action | —Unverified | 0 | 0 |
| General-purpose foundation models for increased autonomy in robot-assisted surgery | Jan 1, 2024 | Vision-Language-Action | —Unverified | 0 | 0 |
| GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation | Feb 13, 2025 | Contrastive Learning, Video Generation | —Unverified | 0 | 0 |
| GR00T N1: An Open Foundation Model for Generalist Humanoid Robots | Mar 18, 2025 | Imitation Learning, Vision-Language-Action | —Unverified | 0 | 0 |
| GRAPE: Generalizing Robot Policy via Preference Alignment | Nov 28, 2024 | Vision-Language-Action | —Unverified | 0 | 0 |
| Grounding Multimodal LLMs to Embodied Agents that Ask for Help with Reinforcement Learning | Apr 1, 2025 | Reinforcement Learning (RL), Vision-Language-Action | —Unverified | 0 | 0 |
| HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation | Feb 8, 2025 | Robot Manipulation, Vision-Language-Action | —Unverified | 0 | 0 |
| Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models | Feb 26, 2025 | Instruction Following, Vision-Language-Action | —Unverified | 0 | 0 |
| HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers | Sep 12, 2024 | Vision-Language-Action | —Unverified | 0 | 0 |
| Hume: Introducing System-2 Thinking in Visual-Language-Action Model | May 27, 2025 | Denoising, Vision-Language-Action | —Unverified | 0 | 0 |
| HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model | Mar 13, 2025 | Common Sense Reasoning, Denoising | —Unverified | 0 | 0 |
| Improving Vision-Language-Action Model with Online Reinforcement Learning | Jan 28, 2025 | reinforcement-learning, Reinforcement Learning | —Unverified | 0 | 0 |