Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning | Oct 14, 2024 | Distributional Reinforcement Learning, reinforcement-learning | Unverified | 0
Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes | Oct 14, 2024 | Decision Making, Decision Making Under Uncertainty | Unverified | 0
DR-MPC: Deep Residual Model Predictive Control for Real-world Social Navigation | Oct 14, 2024 | Deep Reinforcement Learning, Model Predictive Control | Unverified | 0
Asymptotic Analysis of Sample-averaged Q-learning | Oct 14, 2024 | OpenAI Gym, Q-Learning | Unverified | 0
Large Language Model-Enhanced Reinforcement Learning for Generic Bus Holding Control Strategies | Oct 14, 2024 | In-Context Learning, Language Modeling | Unverified | 0
Integrating Reinforcement Learning and Large Language Models for Crop Production Process Management Optimization and Control through A New Knowledge-Based Deep Learning Paradigm | Oct 13, 2024 | Management, Offline RL | Unverified | 0
Improving Generalization on the ProcGen Benchmark with Simple Architectural Changes and Scale | Oct 13, 2024 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 0
Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models | Oct 13, 2024 | In-Context Learning, Reinforcement Learning (RL) | Unverified | 0
Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator | Oct 13, 2024 | All, Bilevel Optimization | Unverified | 0
Generalization of Compositional Tasks with Logical Specification via Implicit Planning | Oct 13, 2024 | Graph Neural Network, Reinforcement Learning (RL) | Unverified | 0
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning | Oct 12, 2024 | Efficient Exploration, reinforcement-learning | Unverified | 0
Reinforcement Learning in Hyperbolic Spaces: Models and Experiments | Oct 12, 2024 | reinforcement-learning, Reinforcement Learning | Unverified | 0
SAPIENT: Mastering Multi-turn Conversational Recommendation with Strategic Planning and Monte Carlo Tree Search | Oct 12, 2024 | Conversational Recommendation, Conversational Search | Code Available | 0
Physical Simulation for Multi-agent Multi-machine Tending | Oct 11, 2024 | Reinforcement Learning (RL) | Unverified | 0
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Oct 11, 2024 | GSM8K, Language Modeling | Code Available | 2
Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control | Oct 11, 2024 | continuous-control, Continuous Control | Code Available | 0
Reinforcement Learning for Control of Non-Markovian Cellular Population Dynamics | Oct 11, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 0
SOLD: Slot Object-Centric Latent Dynamics Models for Relational Manipulation Learning from Pixels | Oct 11, 2024 | Model-based Reinforcement Learning, reinforcement-learning | Unverified | 0
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL | Oct 11, 2024 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Unverified | 0
Can we hop in general? A discussion of benchmark selection and design using the Hopper environment | Oct 11, 2024 | Benchmarking, Reinforcement Learning (RL) | Unverified | 0
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient | Oct 11, 2024 | Mamba, Model-based Reinforcement Learning | Code Available | 1
Words as Beacons: Guiding RL Agents with High-Level Language Prompts | Oct 11, 2024 | Reinforcement Learning (RL) | Unverified | 0
StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Models | Oct 10, 2024 | Question Answering, Reinforcement Learning (RL) | Code Available | 1
Exploring Natural Language-Based Strategies for Efficient Number Learning in Children through Reinforcement Learning | Oct 10, 2024 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 0
Avoiding mode collapse in diffusion models fine-tuned with reinforcement learning | Oct 10, 2024 | Denoising, Diversity | Unverified | 0
VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based Verifiers | Oct 10, 2024 | Mathematical Reasoning, Q-Learning | Unverified | 0
Probabilistic Satisfaction of Temporal Logic Constraints in Reinforcement Learning via Adaptive Policy-Switching | Oct 10, 2024 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Offline Hierarchical Reinforcement Learning via Inverse Optimization | Oct 10, 2024 | Decision Making, Hierarchical Reinforcement Learning | Unverified | 0
Masked Generative Priors Improve World Models Sequence Modelling Capabilities | Oct 10, 2024 | continuous-control, Continuous Control | Unverified | 0
Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare | Oct 10, 2024 | Common Sense Reasoning, Data Augmentation | Unverified | 0
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning | Oct 10, 2024 | Reinforcement Learning (RL) | Unverified | 0
Efficient Reinforcement Learning with Large Language Model Priors | Oct 10, 2024 | Bayesian Inference, Decision Making | Unverified | 0
Crafting desirable climate trajectories with RL explored socio-environmental simulations | Oct 9, 2024 | Decision Making, Decision Making Under Uncertainty | Code Available | 0
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation | Oct 9, 2024 | Data Augmentation, Disentanglement | Unverified | 0
MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning | Oct 9, 2024 | Motion Generation, reinforcement-learning | Unverified | 0
Flipping-based Policy for Chance-Constrained Markov Decision Processes | Oct 9, 2024 | Reinforcement Learning (RL), Safe Reinforcement Learning | Unverified | 0
Q-WSL: Optimizing Goal-Conditioned RL with Weighted Supervised Learning via Dynamic Programming | Oct 9, 2024 | Q-Learning, Reinforcement Learning (RL) | Unverified | 0
A Safety Modulator Actor-Critic Method in Model-Free Safe Reinforcement Learning and Application in UAV Hovering | Oct 9, 2024 | Reinforcement Learning (RL), Safe Reinforcement Learning | Unverified | 0
Retrieval-Augmented Decision Transformer: External Memory for In-context RL | Oct 9, 2024 | In-Context Learning, Reinforcement Learning (RL) | Code Available | 1
On the Modeling Capabilities of Large Language Models for Sequential Decision Making | Oct 8, 2024 | Decision Making, Diversity | Unverified | 0
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning | Oct 8, 2024 | GSM8K, Multi-agent Reinforcement Learning | Code Available | 1
Solving robust MDPs as a sequence of static RL problems | Oct 8, 2024 | Reinforcement Learning (RL) | Unverified | 0
Solving Multi-Goal Robotic Tasks with Decision Transformer | Oct 8, 2024 | Multi-Goal Reinforcement Learning, reinforcement-learning | Unverified | 0
Reinforcement Learning From Imperfect Corrective Actions And Proxy Rewards | Oct 8, 2024 | Atari Games, Autonomous Driving | Unverified | 0
Towards using Reinforcement Learning for Scaling and Data Replication in Cloud Systems | Oct 7, 2024 | Cloud Computing, reinforcement-learning | Unverified | 0
AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search | Oct 7, 2024 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Towards Measuring Goal-Directedness in AI Systems | Oct 7, 2024 | Reinforcement Learning (RL) | Unverified | 0
GreenLight-Gym: Reinforcement learning benchmark environment for control of greenhouse production systems | Oct 6, 2024 | Numerical Integration, Reinforcement Learning (RL) | Code Available | 1
Data-driven Under Frequency Load Shedding Using Reinforcement Learning | Oct 6, 2024 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Improved Off-policy Reinforcement Learning in Biological Sequence Design | Oct 6, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 0