Reward Gaming in Conditional Text Generation Nov 16, 2022 Conditional Text Generation Reinforcement Learning (RL)
Task Aware Dreamer for Task Generalization in Reinforcement Learning Mar 9, 2023 reinforcement-learning Reinforcement Learning
Rewarding Doubt: A Reinforcement Learning Approach to Confidence Calibration of Large Language Models Mar 4, 2025 Reinforcement Learning (RL)
Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning Sep 19, 2022 Atari Games Benchmarking
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning Oct 10, 2024 Reinforcement Learning (RL)
Rewarding Semantic Similarity under Optimized Alignments for AMR-to-Text Generation May 1, 2022 AMR-to-Text Generation Reinforcement Learning (RL)
Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning May 31, 2019 AMR Parsing reinforcement-learning
Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue Jun 20, 2024 Dialogue State Tracking Reinforcement Learning (RL)
Reward is enough for convex MDPs Jun 1, 2021 Reinforcement Learning (RL)
Reward Is Enough: LLMs Are In-Context Reinforcement Learners May 21, 2025 Large Language Model Reinforcement Learning (RL)
Reward is not enough: can we liberate AI from the reinforcement learning paradigm? Feb 3, 2022 reinforcement-learning Reinforcement Learning (RL)
Reward Learning from Suboptimal Demonstrations with Applications in Surgical Electrocautery Apr 10, 2024 Decision Making Imitation Learning
Reward Learning using Structural Motifs in Inverse Reinforcement Learning Sep 25, 2022 reinforcement-learning Reinforcement Learning
Rewardless Open-Ended Learning (ROEL) Sep 29, 2021 reinforcement-learning Reinforcement Learning
Reward Machine Inference for Robotic Manipulation Dec 13, 2024 Reinforcement Learning (RL)
Reward (Mis)design for Autonomous Driving Apr 28, 2021 Autonomous Driving reinforcement-learning
Reward Poisoning Attacks on Offline Multi-Agent Reinforcement Learning Jun 4, 2022 Multi-agent Reinforcement Learning reinforcement-learning
Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments Feb 16, 2021 reinforcement-learning Reinforcement Learning (RL)
Reward prediction for representation learning and reward shaping May 7, 2021 Prediction Reinforcement Learning (RL)
Reward-Predictive Clustering Nov 7, 2022 Clustering reinforcement-learning
STIR^2: Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks Jan 11, 2022 Autonomous Driving Decision Making
Reward-Respecting Subtasks for Model-Based Reinforcement Learning Feb 7, 2022 Model-based Reinforcement Learning reinforcement-learning
Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning Nov 12, 2022 reinforcement-learning Reinforcement Learning
Reward Shaping for Reinforcement Learning with Omega-Regular Objectives Jan 16, 2020 reinforcement-learning Reinforcement Learning
Reward Shaping for User Satisfaction in a REINFORCE Recommender Sep 30, 2022 Imputation Reinforcement Learning (RL)
Reward Shaping via Diffusion Process in Reinforcement Learning Jun 20, 2023 Navigate reinforcement-learning
Reward Shaping via Meta-Learning Jan 27, 2019 Meta-Learning Reinforcement Learning
Reward Shaping with Dynamic Trajectory Aggregation Apr 13, 2021 reinforcement-learning Reinforcement Learning (RL)
Reward Shaping with Subgoals for Social Navigation Apr 13, 2021 reinforcement-learning Reinforcement Learning
RewardsOfSum: Exploring Reinforcement Learning Rewards for Summarisation Jun 8, 2021 reinforcement-learning Reinforcement Learning
Rewards with Negative Examples for Reinforced Topic-Focused Abstractive Summarization Nov 1, 2021 Abstractive Text Summarization Deep Reinforcement Learning
Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective Aug 13, 2019 reinforcement-learning Reinforcement Learning
Reward Training Wheels: Adaptive Auxiliary Rewards for Robotics Reinforcement Learning Mar 19, 2025 Reinforcement Learning (RL)
REX: Rapid Exploration and eXploitation for AI Agents Jul 18, 2023 AI Agent Decision Making
ReZero: Enhancing LLM search ability by trying one-more-time Apr 15, 2025 Language Modeling Language Modelling
RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration Jun 18, 2019 Imitation Learning reinforcement-learning
Riemannian Stochastic Gradient Method for Nested Composition Optimization Jul 19, 2022 Meta-Learning reinforcement-learning
RILe: Reinforced Imitation Learning Jun 12, 2024 Computational Efficiency Imitation Learning
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs Jun 17, 2025 Data Integration Large Language Model
RIS-assisted UAV Communications for IoT with Wireless Power Transfer Using Deep Reinforcement Learning Aug 5, 2021 Deep Reinforcement Learning Reinforcement Learning (RL)
RISCLESS: A Reinforcement Learning Strategy to Exploit Unused Cloud Resources Apr 28, 2022 reinforcement-learning Reinforcement Learning
Risk-Averse Bayes-Adaptive Reinforcement Learning Feb 10, 2021 Bayesian Optimisation reinforcement-learning
Risk-Averse Learning by Temporal Difference Methods Mar 2, 2020 reinforcement-learning Reinforcement Learning
Risk-averse policies for natural gas futures trading using distributional reinforcement learning Jan 8, 2025 Distributional Reinforcement Learning energy trading
Risk-Averse Reinforcement Learning via Dynamic Time-Consistent Risk Measures Jan 14, 2023 Q-Learning reinforcement-learning
Risk Averse Robust Adversarial Reinforcement Learning Mar 31, 2019 Deep Reinforcement Learning reinforcement-learning
Risk Averse Value Expansion for Sample Efficient and Robust Policy Learning Sep 25, 2019 Model-based Reinforcement Learning MuJoCo
Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search Feb 1, 2021 Decision Making Multi-Objective Reinforcement Learning
Risk-Aware Reinforcement Learning through Optimal Transport Theory Sep 12, 2023 Decision Making Management
Risk-Aware Safe Reinforcement Learning for Control of Stochastic Linear Systems May 14, 2025 Reinforcement Learning (RL) Safe Reinforcement Learning