Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity | May 30, 2024 | Bilevel Optimization, reinforcement-learning | Unverified | 0
Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf | May 30, 2024 | Reinforcement Learning (RL) | Unverified | 0
RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning | May 29, 2024 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning | May 29, 2024 | Reinforcement Learning (RL), Safe Reinforcement Learning | Unverified | 0
Policy Zooming: Adaptive Discretization-based Infinite-Horizon Average-Reward Reinforcement Learning | May 29, 2024 | reinforcement-learning, Reinforcement Learning (RL) | Unverified | 0
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies | May 29, 2024 | Metric Learning, Off-policy evaluation | Code Available | 0
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning | May 29, 2024 | Offline RL, reinforcement-learning | Unverified | 0
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF | May 29, 2024 | reinforcement-learning, Reinforcement Learning | Unverified | 0
A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning | May 29, 2024 | Continual Learning, Deep Reinforcement Learning | Code Available | 0
Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | May 28, 2024 | Imitation Learning, MuJoCo | Code Available | 0
LeDex: Training LLMs to Better Self-Debug and Explain Code | May 28, 2024 | Code Generation, Reinforcement Learning (RL) | Unverified | 0
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL | May 28, 2024 | Offline RL, Reinforcement Learning (RL) | Code Available | 1
DTR-Bench: An in silico Environment and Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime | May 28, 2024 | Benchmarking, Reinforcement Learning (RL) | Code Available | 1
Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination | May 28, 2024 | Offline RL, reinforcement-learning | Code Available | 1
Rethinking Pruning for Backdoor Mitigation: An Optimization Perspective | May 28, 2024 | backdoor defense, Graph Neural Network | Unverified | 0
Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment | May 28, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 0
Mollification Effects of Policy Gradient Methods | May 28, 2024 | continuous-control, Continuous Control | Unverified | 0
Highway Reinforcement Learning | May 28, 2024 | Q-Learning, reinforcement-learning | Unverified | 0
Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding | May 28, 2024 | reinforcement-learning, Reinforcement Learning (RL) | Code Available | 0
Large Language Model-Driven Curriculum Design for Mobile Networks | May 28, 2024 | Language Modeling, Language Modelling | Code Available | 0
Extreme Value Monte Carlo Tree Search | May 28, 2024 | Board Games, Reinforcement Learning (RL) | Unverified | 0
Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales | May 27, 2024 | Atari Games, MuJoCo | Code Available | 0
Ontology-Enhanced Decision-Making for Autonomous Agents in Dynamic and Partially Observable Environments | May 27, 2024 | Decision Making, Reinforcement Learning (RL) | Unverified | 0
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear q^π-Realizability and Concentrability | May 27, 2024 | Computational Efficiency, Offline RL | Unverified | 0
Q-value Regularized Transformer for Offline Reinforcement Learning | May 27, 2024 | D4RL, Offline RL | Code Available | 1
Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning | May 27, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 0
DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems | May 27, 2024 | Reinforcement Learning (RL) | Code Available | 1
Structured Graph Network for Constrained Robot Crowd Navigation with Low Fidelity Simulation | May 27, 2024 | Reinforcement Learning (RL) | Unverified | 0
Rethinking Transformers in Solving POMDPs | May 27, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1
Biological Neurons Compete with Deep Reinforcement Learning in Sample Efficiency in a Simulated Gameworld | May 27, 2024 | Deep Reinforcement Learning, reinforcement-learning | Unverified | 0
Oracle-Efficient Reinforcement Learning for Max Value Ensembles | May 27, 2024 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Triple Preference Optimization: Achieving Better Alignment with Less Data in a Single Step Optimization | May 26, 2024 | Reinforcement Learning (RL) | Code Available | 1
Reinforcement Learning for Jump-Diffusions, with Financial Applications | May 26, 2024 | Q-Learning, reinforcement-learning | Unverified | 0
Competing for pixels: a self-play algorithm for weakly-supervised segmentation | May 26, 2024 | Binary Classification, Image Segmentation | Code Available | 0
Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning | May 26, 2024 | Multi-Objective Reinforcement Learning, reinforcement-learning | Code Available | 0
Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement Learning | May 26, 2024 | reinforcement-learning, Reinforcement Learning | Unverified | 0
An Evolutionary Framework for Connect-4 as Test-Bed for Comparison of Advanced Minimax, Q-Learning and MCTS | May 26, 2024 | Decision Making, Q-Learning | Unverified | 0
Constrained Ensemble Exploration for Unsupervised Skill Discovery | May 25, 2024 | Reinforcement Learning (RL), Unsupervised Reinforcement Learning | Unverified | 0
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control | May 25, 2024 | continuous-control, Continuous Control | Code Available | 2
Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization | May 25, 2024 | continuous-control, Continuous Control | Code Available | 2
Adaptive Q-Network: On-the-fly Target Selection for Deep Reinforcement Learning | May 25, 2024 | Atari Games, AutoML | Unverified | 0
AIGB: Generative Auto-bidding via Conditional Diffusion Modeling | May 25, 2024 | Reinforcement Learning (RL) | Unverified | 0
Embedding-Aligned Language Models | May 24, 2024 | Reinforcement Learning (RL), Text Generation | Unverified | 0
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning | May 24, 2024 | Deep Reinforcement Learning, Q-Learning | Unverified | 0
Knowledge-Informed Auto-Penetration Testing Based on Reinforcement Learning with Reward Machine | May 24, 2024 | Q-Learning, Reinforcement Learning (RL) | Unverified | 0
Diffusion Actor-Critic with Entropy Regulator | May 24, 2024 | Decision Making, MuJoCo | Code Available | 2
Extracting Heuristics from Large Language Models for Reward Shaping in Reinforcement Learning | May 24, 2024 | Language Modelling, Large Language Model | Unverified | 0
Model-free reinforcement learning with noisy actions for automated experimental control in optics | May 24, 2024 | Reinforcement Learning (RL) | Code Available | 0
Human-in-the-loop Reinforcement Learning for Data Quality Monitoring in Particle Physics Experiments | May 24, 2024 | Data Augmentation, Reinforcement Learning (RL) | Unverified | 0
Cooperative Backdoor Attack in Decentralized Reinforcement Learning with Theoretical Guarantee | May 24, 2024 | Backdoor Attack, reinforcement-learning | Unverified | 0