| Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Jun 10, 2024 | Deep Reinforcement Learning, Offline RL | Code Available | 0 |
| Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL | Jun 8, 2024 | Data Augmentation, Mamba | Code Available | 0 |
| Stabilizing Extreme Q-learning by Maclaurin Expansion | Jun 7, 2024 | D4RL, Offline RL | Code Available | 0 |
| Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models | Jun 6, 2024 | Offline RL, reinforcement-learning | Unverified | 0 |
| UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning | Jun 5, 2024 | D4RL, Offline RL | Unverified | 0 |
| A Fast Convergence Theory for Offline Decision Making | Jun 3, 2024 | Decision Making, Offline RL | Unverified | 0 |
| Causal prompting model-based offline reinforcement learning | Jun 3, 2024 | model, Offline RL | Unverified | 0 |
| Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning | May 29, 2024 | Offline RL, reinforcement-learning | Unverified | 0 |
| Inverse Concave-Utility Reinforcement Learning is Inverse Game Theory | May 29, 2024 | Imitation Learning, Offline RL | Unverified | 0 |
| AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization | May 28, 2024 | D4RL, Offline RL | Code Available | 0 |
| Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier | May 28, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear q^π-Realizability and Concentrability | May 27, 2024 | Computational Efficiency, Offline RL | Unverified | 0 |
| OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators | May 27, 2024 | Decision Making, Offline RL | Unverified | 0 |
| Exclusively Penalized Q-learning for Offline Reinforcement Learning | May 23, 2024 | Offline RL, Q-Learning | Unverified | 0 |
| Offline Reinforcement Learning from Datasets with Structured Non-Stationarity | May 23, 2024 | continuous-control, Continuous Control | Code Available | 0 |
| Offline RL via Feature-Occupancy Gradient Ascent | May 22, 2024 | Offline RL | Unverified | 0 |
| Efficient Imitation Learning with Conservative World Models | May 21, 2024 | Imitation Learning, Offline RL | Unverified | 0 |
| Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | May 20, 2024 | Atari Games, Mamba | Code Available | 0 |
| Towards Robust Policy: Enhancing Offline Reinforcement Learning with Adversarial Attacks and Defenses | May 18, 2024 | D4RL, Offline RL | Unverified | 0 |
| Ensemble Successor Representations for Task Generalization in Offline-to-Online Reinforcement Learning | May 12, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Improving Offline Reinforcement Learning with Inaccurate Simulators | May 7, 2024 | D4RL, Generative Adversarial Network | Unverified | 0 |
| Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning | May 6, 2024 | Offline RL | Unverified | 0 |
| Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows | May 6, 2024 | Causal Inference, counterfactual | Unverified | 0 |
| Generalize by Touching: Tactile Ensemble Skill Transfer for Robotic Furniture Assembly | Apr 26, 2024 | Contact-rich Manipulation, Offline RL | Unverified | 0 |
| Offline Reinforcement Learning with Behavioral Supervisor Tuning | Apr 25, 2024 | Offline RL, reinforcement-learning | Unverified | 0 |