| VITA: Vision-to-Action Flow Matching Policy | Jul 17, 2025 | Action Generation | —Unverified | 0 |
| Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | Jun 26, 2025 | Action Generation, Decision Making | —Unverified | 0 |
| Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends | Jun 26, 2025 | Action Generation, Vision-Language-Action | Code Available | 2 |
| WorldVLA: Towards Autoregressive Action World Model | Jun 26, 2025 | Action Generation, model | Code Available | 4 |
| VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models | Jun 21, 2025 | Action Generation, Continual Learning | —Unverified | 0 |
| CodeDiffuser: Attention-Enhanced Diffusion Policy via VLM-Generated Code for Instruction Ambiguity | Jun 19, 2025 | Action Generation, Contact-rich Manipulation | —Unverified | 0 |
| Block-wise Adaptive Caching for Accelerating Diffusion Policy | Jun 16, 2025 | Action Generation, Denoising | —Unverified | 0 |
| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Jun 16, 2025 | Action Generation, Autonomous Driving | Code Available | 3 |
| Time-Unified Diffusion Policy with Action Discrimination for Robotic Manipulation | Jun 11, 2025 | Action Generation, Action Recognition | —Unverified | 0 |
| An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models | Jun 10, 2025 | Action Generation, Image Captioning | —Unverified | 0 |
| FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency | Jun 10, 2025 | Action Generation, Image Generation | —Unverified | 0 |
| Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games | Jun 5, 2025 | Action Generation, Asynchronous Group Communication | Code Available | 1 |
| OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Jun 4, 2025 | Action Generation, Decision Making | Code Available | 1 |
| STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization | Jun 4, 2025 | Action Generation, Quantization | Code Available | 0 |
| SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Jun 2, 2025 | Action Generation, GPU | Code Available | 11 |
| Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction | May 30, 2025 | Action Generation, Optical Flow Estimation | —Unverified | 0 |
| Hierarchical Instruction-aware Embodied Visual Tracking | May 27, 2025 | Action Generation, Position | —Unverified | 0 |
| ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos | May 24, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 |
| Distilling LLM Agent into Small Models with Retrieval and Code Tools | May 23, 2025 | Action Generation, Domain Generalization | Code Available | 3 |
| FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance | May 19, 2025 | Action Generation, Human action generation | —Unverified | 0 |
| LLM-Explorer: Towards Efficient and Affordable LLM-based Exploration for Mobile Apps | May 15, 2025 | Action Generation | Code Available | 1 |
| Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches | May 14, 2025 | Action Generation, Image Generation | Code Available | 1 |
| H^3DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning | May 12, 2025 | Action Generation | —Unverified | 0 |
| STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game | May 6, 2025 | Action Generation, Code Generation | —Unverified | 0 |
| A Survey on GUI Agents with Foundation Models Enhanced by Reinforcement Learning | Apr 29, 2025 | Action Generation, Prompt Engineering | —Unverified | 0 |