SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
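The agent-environment loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed here. The five-cell corridor environment, the `step` and `train` helpers, and all hyperparameter values are assumptions made for the example. The agent starts in cell 0, receives a reward of 1.0 only on reaching cell 4, and learns action values whose greedy policy maximizes the discounted long-term reward.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (assumed for illustration): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and random tie-breaking."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)          # explore
            else:
                best = max(q[state])                  # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Feedback from the environment: move the estimate toward the
            # reward plus the discounted value of the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal cells 0..3.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

In this sketch the learned policy is simply the action with the highest estimated value in each cell. Real benchmarks replace the corridor with a far richer environment and the table with a function approximator.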

Papers

Showing 14951-15000 of 15113 papers

Title | Status | Hype
Learning Time-Sensitive Strategies in Space Fortress | Code | 0
Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework | Code | 0
Guided Exploration in Reinforcement Learning via Monte Carlo Critic Optimization | Code | 0
Exchangeable Models in Meta Reinforcement Learning | Code | 0
Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents | Code | 0
Learning from Trajectories via Subgoal Discovery | Code | 0
Action Priors for Large Action Spaces in Robotics | Code | 0
Continuous Doubly Constrained Batch Reinforcement Learning | Code | 0
ExIt-OOS: Towards Learning from Planning in Imperfect Information Games | Code | 0
Guided Policy Optimization under Partial Observability | Code | 0
Incorporating Rivalry in Reinforcement Learning for a Competitive Game | Code | 0
Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL | Code | 0
An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents | Code | 0
LEACH-RLC: Enhancing IoT Data Transmission with Optimized Clustering and Reinforcement Learning | Code | 0
GuideLight: "Industrial Solution" Guidance for More Practical Traffic Signal Control Agents | Code | 0
Continuous Deep Q-Learning with Simulator for Stabilization of Uncertain Discrete-Time Systems | Code | 0
Guiding Evolutionary Strategies by Differentiable Robot Simulators | Code | 0
Bayesian Inference with Anchored Ensembles of Neural Networks, and Application to Exploration in Reinforcement Learning | Code | 0
Increasing Data Efficiency of Driving Agent By World Model | Code | 0
Bayesian Design Principles for Offline-to-Online Reinforcement Learning | Code | 0
Continuous Control With Ensemble Deep Deterministic Policy Gradients | Code | 0
Bayesian Curiosity for Efficient Exploration in Reinforcement Learning | Code | 0
Experiential Explanations for Reinforcement Learning | Code | 0
Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces | Code | 0
Increasing performance of electric vehicles in ride-hailing services using deep reinforcement learning | Code | 0
Experimental evaluation of offline reinforcement learning for HVAC control in buildings | Code | 0
Increasing the Action Gap: New Operators for Reinforcement Learning | Code | 0
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning | Code | 0
Learning Generalizable Device Placement Algorithms for Distributed Machine Learning | Code | 0
Batch Value-function Approximation with Only Realizability | Code | 0
Expert-Free Online Transfer Learning in Multi-Agent Reinforcement Learning | Code | 0
Learning Generalizable Representations for Reinforcement Learning via Adaptive Meta-learner of Behavioral Similarities | Code | 0
Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning | Code | 0
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo | Code | 0
Continuous-action Reinforcement Learning for Playing Racing Games: Comparing SPG to PPO | Code | 0
EXPIL: Explanatory Predicate Invention for Learning in Games | Code | 0
Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning | Code | 0
BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning | Code | 0
Balancing Value Underestimation and Overestimation with Realistic Actor-Critic | Code | 0
Explainable Action Advising for Multi-Agent Reinforcement Learning | Code | 0
Adaptive Diffusion Policy Optimization for Robotic Manipulation | Code | 0
Attention-Based Reward Shaping for Sparse and Delayed Rewards | Code | 0
Explainable and Safe Reinforcement Learning for Autonomous Air Mobility | Code | 0
Learning to Ask Medical Questions using Reinforcement Learning | Code | 0
An analysis of Reinforcement Learning applied to Coach task in IEEE Very Small Size Soccer | Code | 0
Balancing the Scales: Reinforcement Learning for Fair Classification | Code | 0
Adversarial Online Multi-Task Reinforcement Learning | Code | 0
Learning to Schedule Communication in Multi-agent Reinforcement Learning | Code | 0
Balancing detectability and performance of attacks on the control channel of Markov Decision Processes | Code | 0
Continual Task Learning through Adaptive Policy Self-Composition | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified