SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
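The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustration under stated assumptions: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameters are hypothetical and do not come from any paper listed on this page.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical toy chain: action 1 moves right, action 0 moves left.

    Reaching the last state yields reward 1.0 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, n_states=5, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action, n_states)
            # Bootstrap from the best next action unless the episode ended.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to action 1)."""
    return [0 if left > right else 1 for left, right in q]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

Here the reward signal is the single payoff at the end of the chain, the cumulative reward is discounted by `gamma`, and the learned policy is read off the Q-table by taking the best action in each state. In this toy setting the greedy policy should learn to move right from every non-terminal state.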

Papers

Showing 12751–12800 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning to Decompose Compound Questions with Reinforcement Learning |  | 0 |
| Backplay: 'Man muss immer umkehren' |  | 0 |
| Automata Guided Skill Composition |  | 0 |
| Learning to Control Visual Abstractions for Structured Exploration in Deep Reinforcement Learning |  | 0 |
| Deep reinforcement learning with relational inductive biases |  | 0 |
| Few-Shot Intent Inference via Meta-Inverse Reinforcement Learning |  | 0 |
| A Guider Network for Multi-Dual Learning |  | 0 |
| Efficient Model-free Reinforcement Learning in Metric Spaces | Code | 0 |
| Learning Heuristics for Automated Reasoning through Reinforcement Learning |  | 0 |
| Learning Goal-Conditioned Value Functions with one-step Path rewards rather than Goal-Rewards |  | 0 |
| A new dog learns old tricks: RL finds classic optimization algorithms |  | 0 |
| Learning to Progressively Plan |  | 0 |
| Learning to Reinforcement Learn by Imitation |  | 0 |
| Information-Theoretic Considerations in Batch Reinforcement Learning |  | 0 |
| Inducing Cooperation via Learning to reshape rewards in semi-cooperative multi-agent reinforcement learning |  | 0 |
| Learning To Solve Circuit-SAT: An Unsupervised Differentiable Approach |  | 0 |
| Learning agents with prioritization and parameter noise in continuous state and action space |  | 0 |
| Driving with Style: Inverse Reinforcement Learning in General-Purpose Planning for Automated Driving |  | 0 |
| Learning Actionable Representations with Goal Conditioned Policies |  | 0 |
| DHER: Hindsight Experience Replay for Dynamic Goals | Code | 0 |
| ACTRCE: Augmenting Experience via Teacher’s Advice |  | 0 |
| Explicit Recall for Efficient Exploration |  | 0 |
| SIMILE: Introducing Sequential Information towards More Effective Imitation Learning |  | 0 |
| Soft Q-Learning with Mutual-Information Regularization |  | 0 |
| M^3RL: Mind-aware Multi-agent Management Reinforcement Learning |  | 0 |
| Sample-efficient policy learning in multi-agent Reinforcement Learning via meta-learning |  | 0 |
| Uncovering Surprising Behaviors in Reinforcement Learning via Worst-case Analysis |  | 0 |
| SUPERVISED POLICY UPDATE | Code | 0 |
| Predicted Variables in Programming |  | 0 |
| NEURAL MALWARE CONTROL WITH DEEP REINFORCEMENT LEARNING |  | 0 |
| Understanding & Generalizing AlphaGo Zero |  | 0 |
| Visceral Machines: Reinforcement Learning with Intrinsic Physiological Rewards |  | 0 |
| Recurrent Experience Replay in Distributed Reinforcement Learning | Code | 0 |
| Towards Consistent Performance on Atari using Expert Demonstrations |  | 0 |
| Rating Continuous Actions in Spatial Multi-Agent Problems |  | 0 |
| Modeling the Long Term Future in Model-Based Reinforcement Learning |  | 0 |
| Generative Adversarial Imagination for Sample Efficient Deep Reinforcement Learning |  | 0 |
| Argus: Smartphone-enabled Human Cooperation via Multi-Agent Reinforcement Learning for Disaster Situational Awareness |  | 0 |
| Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications Outside Coverage |  | 0 |
| RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion | Code | 0 |
| Deep Neuroevolution of Recurrent and Discrete World Models | Code | 0 |
| Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning |  | 0 |
| Self Training Autonomous Driving Agent |  | 0 |
| Ray Interference: a Source of Plateaus in Deep Reinforcement Learning |  | 0 |
| Safe Reinforcement Learning with Scene Decomposition for Navigating Complex Urban Environments | Code | 0 |
| Deep Reinforcement Learning for Optimal Critical Care Pain Management with Morphine using Dueling Double-Deep Q Networks |  | 0 |
| Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework | Code | 0 |
| Cognitive Radar Using Reinforcement Learning in Automotive Applications |  | 0 |
| Grounding Natural Language Commands to StarCraft II Game States for Narration-Guided Reinforcement Learning |  | 0 |
| How You Act Tells a Lot: Privacy-Leakage Attack on Deep Reinforcement Learning |  | 0 |
Page 256 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified |