SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find the optimal policy, that is, the decision-making strategy that maximizes the long-term reward.
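
As a rough illustration of the loop described above (act, observe a reward, update, repeat), here is a minimal sketch of tabular Q-learning. Q-learning is only one of many RL algorithms, and everything in the example is an assumption made for illustration: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are hypothetical and are not taken from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    # Reward 1.0 only when the goal cell is reached, 0.0 otherwise.
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state (ties resolve to 'right')."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the discount factor `gamma` is what turns the single goal reward into a long-term objective: states further from the goal receive smaller values, so the greedy policy prefers the action that reaches the goal soonest.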

Papers

Showing 6101–6150 of 15113 papers

Title | Status | Hype
Learning to Grasp the Ungraspable with Emergent Extrinsic Dexterity | | 0
Learning to grow: control of material self-assembly using evolutionary reinforcement learning | | 0
Learning to Guide a Saturation-Based Theorem Prover | | 0
Learning to Guide Multiple Heterogeneous Actors from a Single Human Demonstration via Automatic Curriculum Learning in StarCraft II | | 0
Learning to Herd Agents Amongst Obstacles: Training Robust Shepherding Behaviors using Deep Reinforcement Learning | | 0
Learning to Infer Unseen Contexts in Causal Contextual Reinforcement Learning | | 0
Learning to Influence Human Behavior with Offline Reinforcement Learning | | 0
Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration | | 0
Learning to Learn: Meta-Critic Networks for Sample Efficient Learning | | 0
Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning | | 0
Learning to Locomote with Deep Neural-Network and CPG-based Control in a Soft Snake Robot | | 0
Learning to Minimize Age of Information over an Unreliable Channel with Energy Harvesting | | 0
Learning to Mitigate AI Collusion on Economic Platforms | | 0
Learning to Mix n-Step Returns: Generalizing lambda-Returns for Deep Reinforcement Learning | | 0
Learning to Navigate the Web | | 0
Learning to Observe with Reinforcement Learning | | 0
Learning to Operate an Electric Vehicle Charging Station Considering Vehicle-grid Integration | | 0
Learning to Operate in Open Worlds by Adapting Planning Models | | 0
Learning to Optimize | | 0
Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads | | 0
Learning to Optimize Neural Nets | | 0
Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning | | 0
Learning to Order Sub-questions for Complex Question Answering | | 0
Learning to Perform Physics Experiments via Deep Reinforcement Learning | | 0
Learning to Plan via Deep Optimistic Value Exploration | | 0
Learning to Play Pong using Policy Gradient Learning | | 0
Learning to Play Soccer by Reinforcement and Applying Sim-to-Real to Compete in the Real World | | 0
Learning to Play Table Tennis From Scratch using Muscular Robots | | 0
Learning to Play Two-Player Perfect-Information Games without Knowledge | | 0
Learning to predict where to look in interactive environments using deep recurrent q-learning | | 0
Learning to Program Variational Quantum Circuits with Fast Weights | | 0
Learning to Progressively Plan | | 0
Learning to Provably Satisfy High Relative Degree Constraints for Black-Box Systems | | 0
Learning to Prune Deep Neural Networks via Reinforcement Learning | | 0
Learning to Query Internet Text for Informing Reinforcement Learning Agents | | 0
Learning to Reach Goals Without Reinforcement Learning | | 0
Learning to Reason: Distilling Hierarchy via Self-Supervision and Reinforcement Learning | | 0
Learning to Reason in Large Theories without Imitation | | 0
Learning to Recover Sparse Signals | | 0
Learning to Reinforcement Learn by Imitation | | 0
Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning | | 0
Learning to Represent Action Values as a Hypergraph on the Action Vertices | | 0
Learning to Resolve Alliance Dilemmas in Many-Player Zero-Sum Games | | 0
Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning | | 0
Learning Torque Control for Quadrupedal Locomotion | | 0
Learning to Run challenge: Synthesizing physiologically accurate motion using deep reinforcement learning | | 0
Learning to Run with Potential-Based Reward Shaping and Demonstrations from Video Data | | 0
Learning to Sail Dynamic Networks: The MARLIN Reinforcement Learning Framework for Congestion Control in Tactical Environments | | 0
Learning to sample in Cartesian MRI | | 0
Learning to Sample with Local and Global Contexts in Experience Replay Buffer | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified