Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning | Oct 10, 2024 | Reinforcement Learning (RL) | Unverified
Avoiding mode collapse in diffusion models fine-tuned with reinforcement learning | Oct 10, 2024 | Denoising, Diversity | Unverified
Masked Generative Priors Improve World Models Sequence Modelling Capabilities | Oct 10, 2024 | continuous-control, Continuous Control | Unverified
Efficient Reinforcement Learning with Large Language Model Priors | Oct 10, 2024 | Bayesian Inference, Decision Making | Unverified
Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare | Oct 10, 2024 | Common Sense Reasoning, Data Augmentation | Unverified
Exploring Natural Language-Based Strategies for Efficient Number Learning in Children through Reinforcement Learning | Oct 10, 2024 | Deep Reinforcement Learning, reinforcement-learning | Code Available
Offline Hierarchical Reinforcement Learning via Inverse Optimization | Oct 10, 2024 | Decision Making, Hierarchical Reinforcement Learning | Unverified
VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based Verifiers | Oct 10, 2024 | Mathematical Reasoning, Q-Learning | Unverified
Probabilistic Satisfaction of Temporal Logic Constraints in Reinforcement Learning via Adaptive Policy-Switching | Oct 10, 2024 | reinforcement-learning, Reinforcement Learning | Unverified
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation | Oct 9, 2024 | Data Augmentation, Disentanglement | Unverified
Crafting desirable climate trajectories with RL explored socio-environmental simulations | Oct 9, 2024 | Decision Making, Decision Making Under Uncertainty | Code Available
A Safety Modulator Actor-Critic Method in Model-Free Safe Reinforcement Learning and Application in UAV Hovering | Oct 9, 2024 | Reinforcement Learning (RL), Safe Reinforcement Learning | Unverified
MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning | Oct 9, 2024 | Motion Generation, reinforcement-learning | Unverified
Flipping-based Policy for Chance-Constrained Markov Decision Processes | Oct 9, 2024 | Reinforcement Learning (RL), Safe Reinforcement Learning | Unverified
Q-WSL: Optimizing Goal-Conditioned RL with Weighted Supervised Learning via Dynamic Programming | Oct 9, 2024 | Q-Learning, Reinforcement Learning (RL) | Unverified
Solving robust MDPs as a sequence of static RL problems | Oct 8, 2024 | Reinforcement Learning (RL) | Unverified
Reinforcement Learning From Imperfect Corrective Actions And Proxy Rewards | Oct 8, 2024 | Atari Games, Autonomous Driving | Unverified
Solving Multi-Goal Robotic Tasks with Decision Transformer | Oct 8, 2024 | Multi-Goal Reinforcement Learning, reinforcement-learning | Unverified
On the Modeling Capabilities of Large Language Models for Sequential Decision Making | Oct 8, 2024 | Decision Making, Diversity | Unverified
AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search | Oct 7, 2024 | reinforcement-learning, Reinforcement Learning | Unverified
Towards Measuring Goal-Directedness in AI Systems | Oct 7, 2024 | Reinforcement Learning (RL) | Unverified
Towards using Reinforcement Learning for Scaling and Data Replication in Cloud Systems | Oct 7, 2024 | Cloud Computing, reinforcement-learning | Unverified
DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL | Oct 6, 2024 | Reinforcement Learning (RL) | Unverified
Data-driven Under Frequency Load Shedding Using Reinforcement Learning | Oct 6, 2024 | reinforcement-learning, Reinforcement Learning | Unverified
AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning | Oct 6, 2024 | Ensemble Learning, reinforcement-learning | Unverified
Improved Off-policy Reinforcement Learning in Biological Sequence Design | Oct 6, 2024 | reinforcement-learning, Reinforcement Learning | Code Available
A Reinforcement Learning Engine with Reduced Action and State Space for Scalable Cyber-Physical Optimal Response | Oct 6, 2024 | Reinforcement Learning (RL) | Unverified
Improving Portfolio Optimization Results with Bandit Networks | Oct 5, 2024 | Portfolio Optimization, Recommendation Systems | Code Available
Spatial-aware decision-making with ring attractors in reinforcement learning systems | Oct 4, 2024 | Decision Making, Reinforcement Learning (RL) | Unverified
Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients | Oct 3, 2024 | Reinforcement Learning (RL) | Unverified
Cross-Embodiment Dexterous Grasping with Reinforcement Learning | Oct 3, 2024 | reinforcement-learning, Reinforcement Learning | Unverified
Learning Emergence of Interaction Patterns across Independent RL Agents in Multi-Agent Environments | Oct 3, 2024 | Multi-agent Reinforcement Learning, Reinforcement Learning (RL) | Unverified
Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping | Oct 3, 2024 | GPU, Mixture-of-Experts | Unverified
End-to-end Driving in High-Interaction Traffic Scenarios with Reinforcement Learning | Oct 3, 2024 | Autonomous Driving, CARLA Leaderboard 2.0 | Unverified
Dual Active Learning for Reinforcement Learning from Human Feedback | Oct 3, 2024 | Active Learning, reinforcement-learning | Unverified
Beyond Expected Returns: A Policy Gradient Algorithm for Cumulative Prospect Theoretic Reinforcement Learning | Oct 3, 2024 | Reinforcement Learning (RL) | Unverified
Adaptive teachers for amortized samplers | Oct 2, 2024 | Decision Making, Efficient Exploration | Code Available
The Smart Buildings Control Suite: A Diverse Open Source Benchmark to Evaluate and Scale HVAC Control Policies for Sustainability | Oct 2, 2024 | Model Predictive Control, Offline RL | Unverified
Scalable Reinforcement Learning-based Neural Architecture Search | Oct 2, 2024 | Neural Architecture Search, reinforcement-learning | Unverified
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models | Oct 2, 2024 | In-Context Learning, Reinforcement Learning (RL) | Unverified
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL | Oct 2, 2024 | Reinforcement Learning (RL) | Unverified
LLM-Augmented Symbolic Reinforcement Learning with Landmark-Based Task Decomposition | Oct 2, 2024 | Common Sense Reasoning, Inductive logic programming | Unverified
ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization | Oct 2, 2024 | MuJoCo, Multi-agent Reinforcement Learning | Unverified
Sampling from Energy-based Policies using Diffusion | Oct 2, 2024 | continuous-control, Continuous Control | Unverified
PreND: Enhancing Intrinsic Motivation in Reinforcement Learning through Pre-trained Network Distillation | Oct 2, 2024 | Developmental Learning, reinforcement-learning | Unverified
Absolute State-wise Constrained Policy Optimization: High-Probability State-wise Constraints Satisfaction | Oct 2, 2024 | Autonomous Driving, continuous-control | Unverified
Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | Oct 2, 2024 | Decision Making, Distributional Reinforcement Learning | Unverified
Upper and Lower Bounds for Distributionally Robust Off-Dynamics Reinforcement Learning | Sep 30, 2024 | 2k, Computational Efficiency | Unverified
Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner | Sep 30, 2024 | Reinforcement Learning (RL) | Unverified
Personalisation via Dynamic Policy Fusion | Sep 30, 2024 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Unverified