SOTAVerified

Action Generation

Papers

Showing 1–50 of 111 papers

Title | Status | Hype
VITA: Vision-to-Action Flow Matching Policy | - | 0
WorldVLA: Towards Autoregressive Action World Model | Code | 4
Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends | Code | 2
Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | - | 0
VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models | - | 0
CodeDiffuser: Attention-Enhanced Diffusion Policy via VLM-Generated Code for Instruction Ambiguity | - | 0
AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Code | 3
Block-wise Adaptive Caching for Accelerating Diffusion Policy | - | 0
Time-Unified Diffusion Policy with Action Discrimination for Robotic Manipulation | - | 0
An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models | - | 0
FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency | - | 0
Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games | Code | 1
STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization | Code | 0
OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Code | 1
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Code | 11
Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction | - | 0
Hierarchical Instruction-aware Embodied Visual Tracking | - | 0
ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos | - | 0
Distilling LLM Agent into Small Models with Retrieval and Code Tools | Code | 3
FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance | - | 0
LLM-Explorer: Towards Efficient and Affordable LLM-based Exploration for Mobile Apps | Code | 1
Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches | Code | 1
H^3DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning | - | 0
STORY2GAME: Generating (Almost) Everything in an Interactive Fiction Game | - | 0
A Survey on GUI Agents with Foundation Models Enhanced by Reinforcement Learning | - | 0
Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion | - | 0
SPECI: Skill Prompts based Hierarchical Continual Imitation Learning for Robot Manipulation | - | 0
Modality Selection and Skill Segmentation via Cross-Modality Attention | - | 0
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners | Code | 2
Prior Does Matter: Visual Navigation via Denoising Diffusion Bridge Models | Code | 2
A Survey on (M)LLM-Based GUI Agents | - | 0
LLM Agents That Act Like Us: Accurate Human Behavior Simulation with Real-World Data | - | 0
ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation | - | 0
Mind with Eyes: from Language Reasoning to Multimodal Reasoning | - | 0
PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning | - | 0
Diffuse-CLoC: Guided Diffusion for Physics-based Character Look-ahead Control | - | 0
TLA: Tactile-Language-Action Model for Contact-Rich Manipulation | - | 0
Agent models: Internalizing Chain-of-Action Generation into Reasoning models | Code | 2
LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications | Code | 2
FRMD: Fast Robot Motion Diffusion with Consistency-Distilled Movement Primitives for Smooth Action Generation | - | 0
What Makes a Good Diffusion Planner for Decision Making? | Code | 2
Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis | - | 0
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success | Code | 5
VDT-Auto: End-to-end Autonomous Driving with VLM-Guided Diffusion Transformers | - | 0
Evolution 6.0: Evolving Robotic Capabilities Through Generative Design | - | 0
PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning | Code | 0
SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning | - | 0
IMLE Policy: Fast and Sample Efficient Visuomotor Policy Learning via Implicit Maximum Likelihood Estimation | - | 0
Large Language Models for Multi-Robot Systems: A Survey | Code | 1
Flow Q-Learning | Code | 3

No leaderboard results yet.