SOTAVerified

Action Generation

Papers

Showing 1–50 of 111 papers

| Title | Status | Hype |
| --- | --- | --- |
| SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Code | 11 |
| Large Action Models: From Inception to Implementation | Code | 9 |
| Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success | Code | 5 |
| WorldVLA: Towards Autoregressive Action World Model | Code | 4 |
| PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models | Code | 3 |
| Flow Q-Learning | Code | 3 |
| AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation | Code | 3 |
| Affordance-based Robot Manipulation with Flow Matching | Code | 3 |
| Distilling LLM Agent into Small Models with Retrieval and Code Tools | Code | 3 |
| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Code | 3 |
| Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving | Code | 2 |
| Learning Physically Realizable Skills for Online Packing of General 3D Shapes | Code | 2 |
| What Makes a Good Diffusion Planner for Decision Making? | Code | 2 |
| Agent models: Internalizing Chain-of-Action Generation into Reasoning models | Code | 2 |
| Prior Does Matter: Visual Navigation via Denoising Diffusion Bridge Models | Code | 2 |
| InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners | Code | 2 |
| Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends | Code | 2 |
| LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications | Code | 2 |
| AICL: Action In-Context Learning for Video Diffusion Model | Code | 1 |
| Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches | Code | 1 |
| Structure-Aware Human-Action Generation | Code | 1 |
| Wonderful Team: Zero-Shot Physical Task Planning with Visual LLMs | Code | 1 |
| Human Action Generation with Generative Adversarial Networks | Code | 1 |
| Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games | Code | 1 |
| COMMA: Modeling Relationship among Motivations, Emotions and Actions in Language-based Human Activities | Code | 1 |
| MUGL: Large Scale Multi Person Conditional Action Generation with Locomotion | Code | 1 |
| OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Code | 1 |
| Graph Constrained Reinforcement Learning for Natural Language Action Spaces | Code | 1 |
| Generative Adversarial Graph Convolutional Networks for Human Action Synthesis | Code | 1 |
| LLM-Explorer: Towards Efficient and Affordable LLM-based Exploration for Mobile Apps | Code | 1 |
| Action2Motion: Conditioned Generation of 3D Human Motions | Code | 1 |
| Benchmarking Vision, Language, & Action Models on Robotic Learning Tasks | Code | 1 |
| Keep CALM and Explore: Language Models for Action Generation in Text-based Games | Code | 1 |
| EPO: Hierarchical LLM Agents with Environment Preference Optimization | Code | 1 |
| ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning | Code | 1 |
| Large Language Models for Multi-Robot Systems: A Survey | Code | 1 |
| Translation-based Supervision for Policy Generation in Simultaneous Neural Machine Translation | Code | 0 |
| Dynamic Compositional Graph Convolutional Network for Efficient Composite Human Motion Prediction | Code | 0 |
| Text Editing as Imitation Game | Code | 0 |
| Seg2Act: Global Context-aware Action Generation for Document Logical Structuring | Code | 0 |
| STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization | Code | 0 |
| Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction | Code | 0 |
| CogIntAc: Modeling the Relationships between Intention, Emotion and Action in Interactive Process from Cognitive Perspective | Code | 0 |
| PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning | Code | 0 |
| Learning Diverse Stochastic Human-Action Generators by Learning Smooth Latent Transitions | Code | 0 |
| Efficient Motion Planning for Automated Lane Change based on Imitation Learning and Mixed-Integer Optimization | Code | 0 |
| Language-free Compositional Action Generation via Decoupling Refinement | Code | 0 |
| FLAG3D: A 3D Fitness Activity Dataset with Language Instruction | Code | 0 |
| JoTR: A Joint Transformer and Reinforcement Learning Framework for Dialog Policy Learning | Code | 0 |
| Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion | | 0 |

No leaderboard results yet.