| Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning | May 19, 2025 | D4RL, model | Unverified | 0 |
| Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | May 12, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| Offline Multi-agent Reinforcement Learning via Score Decomposition | May 9, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| Model Tensor Planning | May 2, 2025 | model, Model Predictive Control | Code Available | 1 |
| Directly Forecasting Belief for Reinforcement Learning with Delays | May 1, 2025 | D4RL, MuJoCo | Code Available | 0 |
| Variational OOD State Correction for Offline Reinforcement Learning | May 1, 2025 | Decision Making, MuJoCo | Unverified | 0 |
| Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision | Apr 21, 2025 | MuJoCo, Zero-shot Generalization | Unverified | 0 |
| Learning Transferable Friction Models and LuGre Identification via Physics Informed Neural Networks | Apr 16, 2025 | Computational Efficiency, Friction | Unverified | 0 |
| Adapting World Models with Latent-State Dynamics Residuals | Apr 3, 2025 | MuJoCo, Reinforcement Learning (RL) | Unverified | 0 |
| Beyond Non-Expert Demonstrations: Outcome-Driven Action Constraint for Offline Reinforcement Learning | Apr 2, 2025 | MuJoCo, Uncertainty Quantification | Unverified | 0 |
| Handling Delay in Real-Time Reinforcement Learning | Mar 30, 2025 | MuJoCo, reinforcement-learning | Code Available | 0 |
| Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation | Mar 27, 2025 | MuJoCo, SMAC | Code Available | 0 |
| Adventurer: Exploration with BiGAN for Deep Reinforcement Learning | Mar 24, 2025 | Atari Games, Deep Reinforcement Learning | Unverified | 0 |
| CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning | Mar 23, 2025 | Deep Reinforcement Learning, Efficient Exploration | Unverified | 0 |
| Likelihood Reward Redistribution | Mar 20, 2025 | MuJoCo | Unverified | 0 |
| Application of linear regression method to the deep reinforcement learning in continuous action cases | Mar 19, 2025 | Deep Reinforcement Learning, MuJoCo | Code Available | 0 |
| Residual Policy Gradient: A Reward View of KL-regularized Objective | Mar 14, 2025 | Imitation Learning, MuJoCo | Unverified | 0 |
| An Real-Sim-Real (RSR) Loop Framework for Generalizable Robotic Policy Transfer with Differentiable Simulation | Mar 13, 2025 | MuJoCo | Code Available | 1 |
| AVG-DICE: Stationary Distribution Correction by Regression | Mar 3, 2025 | Avg, MuJoCo | Unverified | 0 |
| SrSv: Integrating Sequential Rollouts with Sequential Value Estimation for Multi-agent Reinforcement Learning | Mar 3, 2025 | MuJoCo, Multi-agent Reinforcement Learning | Unverified | 0 |
| IL-SOAR : Imitation Learning with Soft Optimistic Actor cRitic | Feb 27, 2025 | Imitation Learning, MuJoCo | Unverified | 0 |
| Offline Reinforcement Learning via Inverse Optimization | Feb 27, 2025 | Model Predictive Control, MuJoCo | Code Available | 0 |
| RIZE: Regularized Imitation Learning via Distributional Reinforcement Learning | Feb 27, 2025 | Distributional Reinforcement Learning, Imitation Learning | Code Available | 0 |
| Yes, Q-learning Helps Offline In-Context RL | Feb 24, 2025 | In-Context Reinforcement Learning, MuJoCo | Unverified | 0 |
| PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning | Feb 23, 2025 | Action Generation, Decision Making | Code Available | 0 |