SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
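The interaction loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a method from any paper listed here: the `TwoArmedBandit` environment, its payoff probabilities, the `run_agent` function, and the epsilon-greedy rule are all assumptions chosen for brevity. The agent acts, receives a reward, and updates a running estimate of each action's value, so that over time it favors the action with the higher long-term reward.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy environment: arm 1 pays off more often than arm 0."""

    def __init__(self, rng):
        self.rng = rng
        self.win_prob = [0.2, 0.8]  # assumed payoff probabilities

    def step(self, action):
        # Reward is 1.0 on a win and 0.0 otherwise.
        return 1.0 if self.rng.random() < self.win_prob[action] else 0.0


def run_agent(steps=2000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent and return its estimates and totals."""
    rng = random.Random(seed)
    env = TwoArmedBandit(rng)
    values = [0.0, 0.0]  # running estimate of each action's mean reward
    counts = [0, 0]      # how often each action was taken
    total_reward = 0.0
    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: values[a])
        reward = env.step(action)
        counts[action] += 1
        # Incremental mean update of the chosen action's value estimate.
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward
    return values, counts, total_reward


values, counts, total_reward = run_agent()
print("value estimates:", values)
print("action counts:", counts)
print("cumulative reward:", total_reward)
```

Here the learned policy is simply "pick the action with the highest estimated value"; in larger problems the same loop applies, but the value estimates or the policy are typically represented by a function approximator instead of a two-entry list.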

Papers

Showing 3301-3325 of 15113 papers

Title | Status | Hype
A Reinforcement Learning Approach to Sensing Design in Resource-Constrained Wireless Networked Control Systems | Code | 0
A Tree Search Algorithm for Sequence Labeling | Code | 0
Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery | Code | 0
Active One-shot Learning | Code | 0
Reinforcement Learning from Hierarchical Critics | Code | 0
A reinforcement learning approach to rare trajectory sampling | Code | 0
Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning | Code | 0
HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym | Code | 0
Health Text Simplification: An Annotated Corpus for Digestive Cancer Education and Novel Strategies for Reinforcement Learning | Code | 0
Heuristics, Answer Set Programming and Markov Decision Process for Solving a Set of Spatial Puzzles | Code | 0
Harnessing Structures for Value-Based Planning and Reinforcement Learning | Code | 0
Hierarchically Structured Task-Agnostic Continual Learning | Code | 0
Hint assisted reinforcement learning: an application in radio astronomy | Code | 0
Hybrid Reinforcement Learning with Expert State Sequences | Code | 0
Improving Robustness of Deep Reinforcement Learning Agents: Environment Attack based on the Critic Network | Code | 0
A Reinforcement Learning Approach to Domain-Knowledge Inclusion Using Grammar Guided Symbolic Regression | Code | 0
Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning | Code | 0
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0
Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning | Code | 0
Guiding Evolutionary Strategies by Differentiable Robot Simulators | Code | 0
Guided Exploration in Reinforcement Learning via Monte Carlo Critic Optimization | Code | 0
Guided Dialogue Policy Learning without Adversarial Learning in the Loop | Code | 0
Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents | Code | 0
Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog | Code | 0
Adversarial Online Multi-Task Reinforcement Learning | Code | 0
Page 133 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified