| From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation | May 13, 2025 | Robot Manipulation, Spatial Reasoning | Code Available | 1 |
| Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments | May 8, 2025 | Benchmarking, Prompt Engineering | Code Available | 1 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-Experts, Question Answering | Code Available | 1 |
| DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control | Feb 9, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Benchmarking Vision, Language, & Action Models on Robotic Learning Tasks | Nov 4, 2024 | Action Generation, Benchmarking | Code Available | 1 |
| Bridging Language, Vision and Action: Multimodal VAEs in Robotic Manipulation Tasks | Apr 2, 2024 | Vision-Language-Action | Code Available | 1 |
| AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation | Jul 17, 2025 | Vision-Language-Action | Unverified | 0 |
| LaViPlan: Language-Guided Visual Path Planning with RLVR | Jul 17, 2025 | Autonomous Driving, Vision-Language-Action | Unverified | 0 |
| Unified Vision-Language-Action Model | Jun 24, 2025 | Autonomous Driving, model | Unverified | 0 |
| CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation | Jun 24, 2025 | Chunking, Vision-Language-Action | Unverified | 0 |