| DRDT3: Diffusion-Refined Decision Test-Time Training Model | Jan 12, 2025 | D4RL, Offline RL | Unverified | 0 |
| SR-Reward: Taking The Path More Traveled | Jan 4, 2025 | D4RL, Imitation Learning | Unverified | 0 |
| On the Statistical Complexity for Offline and Low-Adaptive Reinforcement Learning with Structures | Jan 3, 2025 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Goal-Conditioned Data Augmentation for Offline Reinforcement Learning | Dec 29, 2024 | D4RL, Data Augmentation | Unverified | 0 |
| Constraint-Adaptive Policy Switching for Offline Safe Reinforcement Learning | Dec 25, 2024 | Decision Making, Offline RL | Code Available | 1 |
| Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL | Dec 25, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 0 |
| Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization | Dec 24, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Offline Reinforcement Learning for LLM Multi-Step Reasoning | Dec 20, 2024 | GSM8K, Math | Code Available | 2 |
| AdaCred: Adaptive Causal Decision Transformers with Feature Crediting | Dec 19, 2024 | Attribute, Imitation Learning | Unverified | 0 |
| Are Expressive Models Truly Necessary for Offline RL? | Dec 15, 2024 | D4RL, Offline RL | Code Available | 1 |
| In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning | Dec 12, 2024 | Offline RL | Code Available | 1 |
| Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning | Dec 11, 2024 | Autonomous Driving, Offline RL | Code Available | 0 |
| Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data | Dec 10, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 2 |
| Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone | Dec 9, 2024 | global-optimization, Imitation Learning | Unverified | 0 |
| Reinforcement Learning: An Overview | Dec 6, 2024 | Decision Making, Deep Reinforcement Learning | Unverified | 0 |
| Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting | Dec 5, 2024 | D4RL, Offline RL | Unverified | 0 |
| Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback | Dec 3, 2024 | Object, Offline RL | Unverified | 0 |
| Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective | Dec 2, 2024 | Density Estimation, Offline RL | Code Available | 2 |
| Robust Offline Reinforcement Learning with Linearly Structured f-Divergence Regularization | Nov 27, 2024 | Computational Efficiency, Offline RL | Unverified | 0 |
| PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement | Nov 26, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Pretrained LLM Adapted with LoRA as a Decision Transformer for Offline RL in Quantitative Trading | Nov 26, 2024 | Offline RL, parameter-efficient fine-tuning | Code Available | 2 |
| LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble | Nov 26, 2024 | Offline RL, Reinforcement Learning (RL) | Unverified | 0 |
| Preserving Expert-Level Privacy in Offline Reinforcement Learning | Nov 18, 2024 | Offline RL, reinforcement-learning | Unverified | 0 |
| Continual Task Learning through Adaptive Policy Self-Composition | Nov 18, 2024 | Continual Learning, Offline RL | Code Available | 0 |
| Doubly Mild Generalization for Offline Reinforcement Learning | Nov 12, 2024 | MuJoCo, Offline RL | Code Available | 1 |