SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
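The interaction loop described above can be illustrated with a small sketch. It uses tabular Q-learning, which is only one of many RL methods and is chosen here purely for brevity. The five-state chain environment, the `step` function, the discount factor `GAMMA`, and the learning rate `ALPHA` are all assumptions made for illustration; none of them come from this page.

```python
import random

N_STATES = 5        # hypothetical chain: states 0..4, state 4 is the terminal goal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right
GAMMA = 0.9         # assumed discount factor for long-term reward
ALPHA = 0.5         # assumed learning rate


def step(state, action):
    """Toy deterministic environment: reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=2000, seed=0):
    """Learn action values from interaction using the Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore uniformly at random; Q-learning is off-policy, so it
            # can still learn the values of the greedy policy.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_values = train()
    print(greedy_policy(q_values))
```

In this toy setting the learned greedy policy should move right in every state, since that is the shortest path to the only rewarding state. Real benchmarks replace the table with a function approximator and the chain with a far richer environment.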

Papers

Showing 14851-14900 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning to Play General-Sum Games Against Multiple Boundedly Rational Agents | Code | 0 |
| Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models | Code | 0 |
| Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning | Code | 0 |
| Learning to reinforcement learn for Neural Architecture Search | Code | 0 |
| Behavior-based Neuroevolutionary Training in Reinforcement Learning | Code | 0 |
| A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents | Code | 0 |
| Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning | Code | 0 |
| Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning | Code | 0 |
| IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit based on Analyses of Interestingness | Code | 0 |
| Lifelong Reinforcement Learning with Modulating Masks | Code | 0 |
| Estimating Risk and Uncertainty in Deep Reinforcement Learning | Code | 0 |
| Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video | Code | 0 |
| Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods | Code | 0 |
| Estimation of Warfarin Dosage with Reinforcement Learning | Code | 0 |
| Andes_gym: A Versatile Environment for Deep Reinforcement Learning in Power Systems | Code | 0 |
| Ethical Challenges in Data-Driven Dialogue Systems | Code | 0 |
| Deep Reinforcement Learning using Genetic Algorithm for Parameter Optimization | Code | 0 |
| Control Regularization for Reduced Variance Reinforcement Learning | Code | 0 |
| Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning | Code | 0 |
| Learning to reset in target search problems | Code | 0 |
| Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning | Code | 0 |
| Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning | Code | 0 |
| GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation | Code | 0 |
| Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime | Code | 0 |
| Improving the Efficient Neural Architecture Search via Rewarding Modifications | Code | 0 |
| Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning | Code | 0 |
| Green Simulation Assisted Reinforcement Learning with Model Risk for Biomanufacturing Learning and Control | Code | 0 |
| Jet grooming through reinforcement learning | Code | 0 |
| An Autonomous Performance Testing Framework using Self-Adaptive Fuzzy Reinforcement Learning | Code | 0 |
| Least-Squares Policy Iteration | Code | 0 |
| Learning from Demonstration without Demonstrations | Code | 0 |
| Beating Atari with Natural Language Guided Reinforcement Learning | Code | 0 |
| Control of nonlinear, complex and black-boxed greenhouse system with reinforcement learning | Code | 0 |
| Bayesian Robust Optimization for Imitation Learning | Code | 0 |
| Improving the Performance of Backward Chained Behavior Trees that use Reinforcement Learning | Code | 0 |
| Evaluating the Paperclip Maximizer: Are RL-Based Language Models More Likely to Pursue Instrumental Goals? | Code | 0 |
| Improving thermal state preparation of Sachdev-Ye-Kitaev model with reinforcement learning on quantum hardware | Code | 0 |
| Control of Continuous Quantum Systems with Many Degrees of Freedom based on Convergent Reinforcement Learning | Code | 0 |
| Evaluating the Robustness of Deep Reinforcement Learning for Autonomous Policies in a Multi-agent Urban Driving Environment | Code | 0 |
| A Reinforcement Learning Framework for Dynamic Mediation Analysis | Code | 0 |
| Improving the sample-efficiency of neural architecture search with reinforcement learning | Code | 0 |
| A Reinforcement Learning Approach to Sensing Design in Resource-Constrained Wireless Networked Control Systems | Code | 0 |
| Improving Unsupervised Hierarchical Representation with Reinforcement Learning | Code | 0 |
| An Autonomous Non-monolithic Agent with Multi-mode Exploration based on Options Framework | Code | 0 |
| Join Query Optimization with Deep Reinforcement Learning Algorithms | Code | 0 |
| Adaptive Estimator Selection for Off-Policy Evaluation | Code | 0 |
| Controlling Large Language Model with Latent Actions | Code | 0 |
| Gradient Importance Learning for Incomplete Observations | Code | 0 |
| Learning the Optimal Power Flow: Environment Design Matters | Code | 0 |
| Bayesian Optimization with Robust Bayesian Neural Networks | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |