| VITA: Vision-to-Action Flow Matching Policy | Jul 17, 2025 | Action Generation | —Unverified | 0 |
| WorldVLA: Towards Autoregressive Action World Model | Jun 26, 2025 | Action Generation, model | Code Available | 4 |
| Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends | Jun 26, 2025 | Action Generation, Vision-Language-Action | Code Available | 2 |
| Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | Jun 26, 2025 | Action Generation, Decision Making | —Unverified | 0 |
| VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models | Jun 21, 2025 | Action Generation, Continual Learning | —Unverified | 0 |
| CodeDiffuser: Attention-Enhanced Diffusion Policy via VLM-Generated Code for Instruction Ambiguity | Jun 19, 2025 | Action Generation, Contact-rich Manipulation | —Unverified | 0 |
| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Jun 16, 2025 | Action Generation, Autonomous Driving | Code Available | 3 |
| Block-wise Adaptive Caching for Accelerating Diffusion Policy | Jun 16, 2025 | Action Generation, Denoising | —Unverified | 0 |
| Time-Unified Diffusion Policy with Action Discrimination for Robotic Manipulation | Jun 11, 2025 | Action Generation, Action Recognition | —Unverified | 0 |
| An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models | Jun 10, 2025 | Action Generation, Image Captioning | —Unverified | 0 |
| FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency | Jun 10, 2025 | Action Generation, Image Generation | —Unverified | 0 |
| Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games | Jun 5, 2025 | Action Generation, Asynchronous Group Communication | Code Available | 1 |
| STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization | Jun 4, 2025 | Action Generation, Quantization | Code Available | 0 |
| OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Jun 4, 2025 | Action Generation, Decision Making | Code Available | 1 |
| SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Jun 2, 2025 | Action Generation, GPU | Code Available | 11 |
| Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction | May 30, 2025 | Action Generation, Optical Flow Estimation | —Unverified | 0 |
| Hierarchical Instruction-aware Embodied Visual Tracking | May 27, 2025 | Action Generation, Position | —Unverified | 0 |
| ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos | May 24, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 |
| Distilling LLM Agent into Small Models with Retrieval and Code Tools | May 23, 2025 | Action Generation, Domain Generalization | Code Available | 3 |
| FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance | May 19, 2025 | Action Generation, Human action generation | —Unverified | 0 |
| LLM-Explorer: Towards Efficient and Affordable LLM-based Exploration for Mobile Apps | May 15, 2025 | Action Generation | Code Available | 1 |
| Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches | May 14, 2025 | Action Generation, Image Generation | Code Available | 1 |
| H^3DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning | May 12, 2025 | Action Generation | —Unverified | 0 |
| STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game | May 6, 2025 | Action Generation, Code Generation | —Unverified | 0 |
| A Survey on GUI Agents with Foundation Models Enhanced by Reinforcement Learning | Apr 29, 2025 | Action Generation, Prompt Engineering | —Unverified | 0 |
| Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion | Apr 29, 2025 | Action Generation, FAD | —Unverified | 0 |
| SPECI: Skill Prompts based Hierarchical Continual Imitation Learning for Robot Manipulation | Apr 22, 2025 | Action Generation, Imitation Learning | —Unverified | 0 |
| Modality Selection and Skill Segmentation via Cross-Modality Attention | Apr 20, 2025 | Action Generation, Contact-rich Manipulation | —Unverified | 0 |
| InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners | Apr 19, 2025 | Action Generation, Logical Reasoning | Code Available | 2 |
| Prior Does Matter: Visual Navigation via Denoising Diffusion Bridge Models | Apr 14, 2025 | Action Generation, Denoising | Code Available | 2 |
| A Survey on (M)LLM-Based GUI Agents | Mar 27, 2025 | Action Generation, Information Retrieval | —Unverified | 0 |
| LLM Agents That Act Like Us: Accurate Human Behavior Simulation with Real-World Data | Mar 26, 2025 | Action Generation | —Unverified | 0 |
| ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation | Mar 25, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 |
| Mind with Eyes: from Language Reasoning to Multimodal Reasoning | Mar 23, 2025 | Action Generation, Multimodal Reasoning | —Unverified | 0 |
| PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning | Mar 21, 2025 | Action Generation, Motion Generation | —Unverified | 0 |
| Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control | Mar 14, 2025 | Action Generation, Motion Generation | —Unverified | 0 |
| TLA: Tactile-Language-Action Model for Contact-Rich Manipulation | Mar 11, 2025 | Action Generation, Contact-rich Manipulation | —Unverified | 0 |
| Agent models: Internalizing Chain-of-Action Generation into Reasoning models | Mar 9, 2025 | Action Generation, Reinforcement Learning (RL) | Code Available | 2 |
| LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications | Mar 4, 2025 | Action Generation | Code Available | 2 |
| FRMD: Fast Robot Motion Diffusion with Consistency-Distilled Movement Primitives for Smooth Action Generation | Mar 3, 2025 | Action Generation, Denoising | —Unverified | 0 |
| What Makes a Good Diffusion Planner for Decision Making? | Mar 1, 2025 | Action Generation, Decision Making | Code Available | 2 |
| Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis | Feb 27, 2025 | Action Generation, AI Agent | —Unverified | 0 |
| Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success | Feb 27, 2025 | Action Generation, Chunking | Code Available | 5 |
| VDT-Auto: End-to-end Autonomous Driving with VLM-Guided Diffusion Transformers | Feb 27, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 |
| Evolution 6.0: Evolving Robotic Capabilities Through Generative Design | Feb 24, 2025 | Action Generation, Text to 3D | —Unverified | 0 |
| PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning | Feb 23, 2025 | Action Generation, Decision Making | Code Available | 0 |
| SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning | Feb 21, 2025 | Action Generation, Decoder | —Unverified | 0 |
| IMLE Policy: Fast and Sample Efficient Visuomotor Policy Learning via Implicit Maximum Likelihood Estimation | Feb 17, 2025 | Action Generation, Imitation Learning | —Unverified | 0 |
| Large Language Models for Multi-Robot Systems: A Survey | Feb 6, 2025 | Action Generation, Benchmarking | Code Available | 1 |
| Flow Q-Learning | Feb 4, 2025 | Action Generation, D4RL | Code Available | 3 |