Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved) | Jul 17, 2025 | Continuous Control | Unverified 0
rQdia: Regularizing Q-Value Distributions With Image Augmentation | Jun 26, 2025 | Continuous Control | Unverified 0
Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity | Jun 20, 2025 | Continuous Control | Code Available 0
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute | Jun 18, 2025 | Continuous Control | Unverified 0
Scaling Algorithm Distillation for Continuous Control with Mamba | Jun 16, 2025 | Continuous Control | Unverified 0
DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Jun 14, 2025 | Continuous Control | Code Available 0
Wasserstein Barycenter Soft Actor-Critic | Jun 11, 2025 | Continuous Control | Unverified 0
Reinforcement Learning via Implicit Imitation Guidance | Jun 9, 2025 | Continuous Control | Unverified 0
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning | Jun 6, 2025 | Continuous Control | Unverified 0
AutoQD: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization | Jun 5, 2025 | Continuous Control | Unverified 0
Safe Planning and Policy Optimization via World Model Learning | Jun 5, 2025 | Continuous Control | Unverified 0
Self-Composing Policies for Scalable Continual Reinforcement Learning | Jun 4, 2025 | Continuous Control | Unverified 0
Unsupervised Meta-Testing with Conditional Neural Processes for Hybrid Meta-Reinforcement Learning | Jun 4, 2025 | Continuous Control | Unverified 0
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control | May 30, 2025 | Continuous Control | Unverified 0
DATD3: Depthwise Attention Twin Delayed Deep Deterministic Policy Gradient For Model Free Reinforcement Learning Under Output Feedback Control | May 29, 2025 | Continuous Control | Unverified 0
Equivalence of stochastic and deterministic policy gradients | May 29, 2025 | Continuous Control | Unverified 0
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better | May 29, 2025 | Continuous Control | Unverified 0
Improving Value Estimation Critically Enhances Vanilla Policy Gradient | May 25, 2025 | Continuous Control | Code Available 0
RLBenchNet: The Right Network for the Right Reinforcement Learning Task | May 21, 2025 | Continuous Control | Code Available 1
World Models as Reference Trajectories for Rapid Motor Adaptation | May 21, 2025 | Continuous Control | Unverified 0
AM-PPO: (Advantage) Alpha-Modulation with Proximal Policy Optimization | May 21, 2025 | Continuous Control | Unverified 0
Guided Policy Optimization under Partial Observability | May 21, 2025 | Continuous Control | Code Available 0
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation | May 20, 2025 | Computational Efficiency, Continuous Control | Code Available 0
KIPPO: Koopman-Inspired Proximal Policy Optimization | May 20, 2025 | Computational Efficiency, Continuous Control | Unverified 0
CIE: Controlling Language Model Text Generations Using Continuous Signals | May 19, 2025 | Continuous Control | Code Available 0
Bi-Level Policy Optimization with Nyström Hypergradients | May 16, 2025 | Bilevel Optimization, Continuous Control | Unverified 0
Monte Carlo Beam Search for Actor-Critic Reinforcement Learning in Continuous Control | May 13, 2025 | Computational Efficiency, Continuous Control | Unverified 0
Adaptive Diffusion Policy Optimization for Robotic Manipulation | May 13, 2025 | Continuous Control | Code Available 0
Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | May 12, 2025 | Continuous Control | Unverified 0
Offline Multi-agent Reinforcement Learning via Score Decomposition | May 9, 2025 | Continuous Control | Unverified 0
Enhanced Robust Tracking Control: An Online Learning Approach | May 8, 2025 | Continuous Control | Code Available 0
CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations | May 8, 2025 | Continuous Control | Unverified 0
Policy-labeled Preference Learning: Is Preference Enough for RLHF? | May 6, 2025 | Continuous Control | Unverified 0
Surrogate Fitness Metrics for Interpretable Reinforcement Learning | Apr 20, 2025 | Continuous Control | Unverified 0
TraCeS: Trajectory Based Credit Assignment From Sparse Safety Feedback | Apr 17, 2025 | Continuous Control | Unverified 0
Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning | Apr 2, 2025 | Continuous Control | Unverified 0
Ensuring Safe and Smooth Control in Safety-Critical Systems via Filtered Control Barrier Functions | Mar 30, 2025 | Continuous Control | Unverified 0
Zero-Shot LLMs in Human-in-the-Loop RL: Replacing Human Feedback for Reward Shaping | Mar 26, 2025 | Continuous Control | Code Available 0
Bootstrapped Model Predictive Control | Mar 24, 2025 | Continuous Control | Code Available 1
KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies | Mar 23, 2025 | Continuous Control | Unverified 0
Learning with Expert Abstractions for Efficient Multi-Task Continuous Control | Mar 19, 2025 | Continuous Control | Code Available 0
VARP: Reinforcement Learning from Vision-Language Model Feedback with Agent Regularized Preferences | Mar 18, 2025 | Continuous Control | Unverified 0
ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer | Mar 13, 2025 | Continuous Control | Unverified 0
Adaptive Anomaly Recovery for Telemanipulation: A Diffusion Model Approach to Vision-Based Tracking | Mar 11, 2025 | Continuous Control | Unverified 0
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning | Mar 7, 2025 | Continuous Control | Code Available 0
Closing the Intent-to-Behavior Gap via Fulfillment Priority Logic | Mar 4, 2025 | Continuous Control | Unverified 0
Improving Plasticity in Non-stationary Reinforcement Learning with Evidential Proximal Policy Optimization | Mar 3, 2025 | Continuous Control | Unverified 0
Discrete Codebook World Models for Continuous Control | Mar 1, 2025 | Continuous Control | Code Available 1
Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction | Feb 28, 2025 | Continuous Control | Code Available 0
Continuous Wrist Control on the Hannes Prosthesis: a Vision-based Shared Autonomy Framework | Feb 24, 2025 | Continuous Control | Unverified 0