SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
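As a rough illustration of the interaction loop described above, the following is a minimal sketch rather than a reference implementation. The `CorridorEnv` environment, its reward values (-1 per step, +10 at the goal), and the two fixed policies are all hypothetical and chosen only to make the cumulative reward easy to follow.

```python
class CorridorEnv:
    """Hypothetical 1-D corridor: the agent starts at 0 and must reach `goal`."""

    def __init__(self, goal=3):
        self.goal = goal
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 1 moves right; any other action moves left (floor at 0)
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos == self.goal
        # assumed reward scheme: +10 on reaching the goal, -1 otherwise
        reward = 10.0 if done else -1.0
        return self.pos, reward, done


def run_episode(env, policy, max_steps=20):
    """Run one episode and return the cumulative (undiscounted) reward."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # agent chooses an action
        state, reward, done = env.step(action)  # environment gives feedback
        total += reward
        if done:
            break
    return total


def always_right(state):
    return 1


def always_left(state):
    return 0


if __name__ == "__main__":
    print(run_episode(CorridorEnv(), always_right))  # 8.0
    print(run_episode(CorridorEnv(), always_left))   # -20.0
```

In this toy setting, the policy that always moves right collects the higher cumulative reward, which is the sense in which one policy is "better" than another; a learning algorithm would search for such a policy instead of having it hard-coded.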

Papers

Showing 10151-10175 of 15113 papers

Title | Status | Hype
On the Convergence of Reinforcement Learning with Monte Carlo Exploring Starts | - | 0
Soft Expert Reward Learning for Vision-and-Language Navigation | - | 0
Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense | - | 0
A Machine Learning Approach for Task and Resource Allocation in Mobile Edge Computing Based Networks | - | 0
Lagrangian Duality in Reinforcement Learning | - | 0
Active MR k-space Sampling with Reinforcement Learning | Code | 1
Interpretable Control by Reinforcement Learning | - | 0
Battlesnake Challenge: A Multi-agent Reinforcement Learning Playground with Human-in-the-loop | Code | 1
A Short Note on Soft-max and Policy Gradients in Bandits Problems | - | 0
An Overview of Natural Language State Representation for Reinforcement Learning | - | 0
CATCH: Context-based Meta Reinforcement Learning for Transferrable Architecture Search | - | 0
Structure Mapping for Transferability of Causal Models | Code | 0
Quick Question: Interrupting Users for Microtasks with Reinforcement Learning | - | 0
WordCraft: An Environment for Benchmarking Commonsense Agents | Code | 1
Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search | Code | 1
Hierarchical Deep Reinforcement Learning Approach for Multi-Objective Scheduling With Varying Queue Sizes | - | 0
Hyperparameter Selection for Offline Reinforcement Learning | - | 0
Discovering Reinforcement Learning Algorithms | Code | 1
Human-like Energy Management Based on Deep Reinforcement Learning and Historical Driving Experiences | - | 0
Decision-making Strategy on Highway for Autonomous Vehicles using Deep Reinforcement Learning | - | 0
Distributed Reinforcement Learning of Targeted Grasping with Active Vision for Mobile Manipulators | - | 0
Dueling Deep Q Network for Highway Decision Making in Autonomous Vehicles: A Case Study | - | 0
DRIFT: Deep Reinforcement Learning for Functional Software Testing | - | 0
Collision Avoidance Robotics Via Meta-Learning (CARML) | Code | 0
CoNES: Convex Natural Evolutionary Strategies | - | 0
Page 407 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified