| Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym | Dec 6, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents | Nov 22, 2023 | Decision Making, Language Modeling | Code Available | 1 |
| Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning | Oct 30, 2023 | Decision Making, Offline RL | Code Available | 1 |
| Out of the Cage: How Stochastic Parrots Win in Cyber Security Environments | Aug 23, 2023 | CyberBattleSim, CyberBattleSim (RL) chain scenario | Code Available | 1 |
| Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback | Jul 20, 2023 | Decision Making, reinforcement-learning | Code Available | 1 |
| ContainerGym: A Real-World Reinforcement Learning Benchmark for Resource Allocation | Jul 6, 2023 | Decision Making, Deep Reinforcement Learning | Code Available | 1 |
| Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent | Jun 20, 2023 | Bayesian Optimization, Decision Making | Code Available | 1 |
| Simplified Temporal Consistency Reinforcement Learning | Jun 15, 2023 | Decision Making, reinforcement-learning | Code Available | 1 |
| Decision Stacks: Flexible Reinforcement Learning via Modular Generative Models | Jun 9, 2023 | Decision Making, reinforcement-learning | Code Available | 1 |
| Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach | Jun 6, 2023 | Decision Making, Sequential Decision Making | Code Available | 1 |