SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
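The interaction loop described above can be sketched with tabular Q-learning on a toy problem. Everything here is an illustrative assumption rather than something taken from the listed papers: the five-state corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the values chosen for the discount factor and learning rate.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Deterministic toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only when the goal is reached
    return next_state, reward, done


def train(episodes=500, seed=0):
    """Learn action values from interaction using the Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore uniformly at random; Q-learning is off-policy, so it
            # can still learn greedy action values from this behaviour.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best
            # estimated value of the next state (zero after termination).
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the learned greedy policy should choose "step right" in every non-terminal state, since that is the shortest route to the only reward. Practical RL systems replace the table with a function approximator and use more careful exploration.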

Papers

Showing 6101-6125 of 15113 papers

Title | Status | Hype
Learning to Grasp the Ungraspable with Emergent Extrinsic Dexterity | | 0
Learning to grow: control of material self-assembly using evolutionary reinforcement learning | | 0
Learning to Guide a Saturation-Based Theorem Prover | | 0
Learning to Guide Multiple Heterogeneous Actors from a Single Human Demonstration via Automatic Curriculum Learning in StarCraft II | | 0
Learning to Herd Agents Amongst Obstacles: Training Robust Shepherding Behaviors using Deep Reinforcement Learning | | 0
Learning to Infer Unseen Contexts in Causal Contextual Reinforcement Learning | | 0
Learning to Influence Human Behavior with Offline Reinforcement Learning | | 0
Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration | | 0
Learning to Learn: Meta-Critic Networks for Sample Efficient Learning | | 0
Learning to Locomote: Understanding How Environment Design Matters for Deep Reinforcement Learning | | 0
Learning to Locomote with Deep Neural-Network and CPG-based Control in a Soft Snake Robot | | 0
Learning to Minimize Age of Information over an Unreliable Channel with Energy Harvesting | | 0
Learning to Mitigate AI Collusion on Economic Platforms | | 0
Learning to Mix n-Step Returns: Generalizing lambda-Returns for Deep Reinforcement Learning | | 0
Learning to Navigate the Web | | 0
Learning to Observe with Reinforcement Learning | | 0
Learning to Operate an Electric Vehicle Charging Station Considering Vehicle-grid Integration | | 0
Learning to Operate in Open Worlds by Adapting Planning Models | | 0
Learning to Optimize | | 0
Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads | | 0
Learning to Optimize Neural Nets | | 0
Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning | | 0
Learning to Order Sub-questions for Complex Question Answering | | 0
Learning to Perform Physics Experiments via Deep Reinforcement Learning | | 0
Learning to Plan via Deep Optimistic Value Exploration | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified