What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator Sep 28, 2020 continuous-control Continuous Control
What are the Statistical Limits of Batch RL with Linear Function Approximation? Jan 1, 2021 reinforcement-learning Reinforcement Learning
What are the Statistical Limits of Offline RL with Linear Function Approximation? Oct 22, 2020 Decision Making Offline RL
What Can RL Bring to VLA Generalization? An Empirical Study May 26, 2025 Reinforcement Learning (RL) Vision-Language-Action
What can you do with a rock? Affordance extraction via word embeddings Mar 9, 2017 Affordance Detection Reinforcement Learning
What deep reinforcement learning tells us about human motor learning and vice-versa Aug 23, 2022 Decision Making Deep Reinforcement Learning
What Does The User Want? Information Gain for Hierarchical Dialogue Policy Optimisation Sep 15, 2021 Dialogue Management Management
What is Going on Inside Recurrent Meta Reinforcement Learning Agents? Apr 29, 2021 Meta Reinforcement Learning reinforcement-learning
What is Interpretable? Using Machine Learning to Design Interpretable Decision-Support Systems Nov 27, 2018 BIG-bench Machine Learning Reinforcement Learning
What is the Reward for Handwriting? -- Handwriting Generation by Imitation Learning Sep 23, 2020 Handwriting generation Imitation Learning
What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study Jan 1, 2021 Attribute continuous-control
What Robot do I Need? Fast Co-Adaptation of Morphology and Control using Graph Neural Networks Nov 3, 2021 Deep Reinforcement Learning reinforcement-learning
What's Behind PPO's Collapse in Long-CoT? Value Optimization Holds the Secret Mar 3, 2025 Math Reinforcement Learning (RL)
What Should I Do Now? Marrying Reinforcement Learning and Symbolic Planning Jan 6, 2019 Deep Reinforcement Learning Question Answering
What Would pi* Do?: Imitation Learning via Off-Policy Reinforcement Learning Sep 27, 2018 Imitation Learning Q-Learning
(When) Are Contrastive Explanations of Reinforcement Learning Helpful? Nov 14, 2022 reinforcement-learning Reinforcement Learning
When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey Mar 29, 2020 Deblurring Decision Making
When Can Large Reasoning Models Save Thinking? Mechanistic Analysis of Behavioral Divergence in Reasoning May 21, 2025 Reinforcement Learning (RL)
When Collaborative Filtering Meets Reinforcement Learning Feb 2, 2019 Collaborative Filtering Interactive Recommendation
When Do Drivers Concentrate? Attention-based Driver Behavior Modeling With Deep Reinforcement Learning Feb 26, 2020 Deep Reinforcement Learning reinforcement-learning
When is Agnostic Reinforcement Learning Statistically Tractable? Oct 9, 2023 reinforcement-learning Reinforcement Learning
When is a Prediction Knowledge? Apr 18, 2019 Decision Making Prediction
When Is Generalizable Reinforcement Learning Tractable? Jan 1, 2021 reinforcement-learning Reinforcement Learning
When is Offline Two-Player Zero-Sum Markov Game Solvable? Jan 10, 2022 Multi-agent Reinforcement Learning reinforcement-learning
When Is Partially Observable Reinforcement Learning Not Scary? Apr 19, 2022 Partially Observable Reinforcement Learning reinforcement-learning
When is Realizability Sufficient for Off-Policy Reinforcement Learning? Nov 10, 2022 reinforcement-learning Reinforcement Learning
When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning Mar 30, 2023 Reinforcement Learning (RL)
When Mining Electric Locomotives Meet Reinforcement Learning Nov 14, 2023 reinforcement-learning Reinforcement Learning
When Multiple Agents Learn to Schedule: A Distributed Radio Resource Management Framework Jun 20, 2019 Deep Reinforcement Learning Management
Provably Robust Blackbox Optimization for Reinforcement Learning Mar 7, 2019 MuJoCo reinforcement-learning
When should agents explore? Aug 26, 2021 Diversity Reinforcement Learning (RL)
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? Apr 12, 2022 Atari Games Diagnostic
When Simple Exploration is Sample Efficient: Identifying Sufficient Conditions for Random Exploration to Yield PAC RL Algorithms May 23, 2018 Efficient Exploration Q-Learning
When to Go, and When to Explore: The Benefit of Post-Exploration in Intrinsic Motivation Mar 29, 2022 Reinforcement Learning (RL)
When to Localize? A Risk-Constrained Reinforcement Learning Approach Nov 5, 2024 reinforcement-learning Reinforcement Learning
When to Trust Your Data: Enhancing Dyna-Style Model-Based Reinforcement Learning With Data Filter Oct 16, 2024 Model-based Reinforcement Learning Reinforcement Learning (RL)
Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning Sep 8, 2021 Adversarial Attack continuous-control
Where Off-Policy Deep Reinforcement Learning Fails Sep 27, 2018 continuous-control Continuous Control
Where the Action is: Let's make Reinforcement Learning for Stochastic Dynamic Vehicle Routing Problems work! Feb 28, 2021 Reinforcement Learning (RL)
Where to go next: Learning a Subgoal Recommendation Policy for Navigation Among Pedestrians Feb 25, 2021 Collision Avoidance Deep Reinforcement Learning
Where to Look: A Unified Attention Model for Visual Recognition with Reinforcement Learning Nov 13, 2021 Q-Learning Reinforcement Learning (RL)
Which Channel to Ask My Question? Personalized Customer Service Request Stream Routing using Deep Reinforcement Learning Nov 24, 2019 Chatbot Deep Reinforcement Learning
Which Mutual-Information Representation Learning Objectives are Sufficient for Control? Jun 14, 2021 Reinforcement Learning (RL) Representation Learning
Whittle index based Q-learning for restless bandits with average reward Apr 29, 2020 Q-Learning reinforcement-learning
Who Are the Best Adopters? User Selection Model for Free Trial Item Promotion Feb 19, 2022 Marketing reinforcement-learning
Whole-body End-Effector Pose Tracking Sep 24, 2024 Pose Tracking Reinforcement Learning (RL)
Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning? Sep 23, 2019 Hierarchical Reinforcement Learning reinforcement-learning
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability Jul 13, 2021 Reinforcement Learning (RL)
Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative Jul 13, 2023 Reinforcement Learning (RL)
Why is Posterior Sampling Better than Optimism for Reinforcement Learning? Jul 1, 2016 reinforcement-learning Reinforcement Learning