SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
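
The agent-environment loop described above can be illustrated with a minimal sketch of tabular Q-learning. This is one common RL algorithm chosen for illustration, not a method attributed to any paper listed here. The five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are hypothetical.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is the terminal goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values from interaction with the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

With these settings the learned greedy policy should choose action 1 (move right) in every non-terminal state, since that is the shortest path to the only reward in this toy setup.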

Papers

Showing 14301-14350 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Deep Neuroevolution of Recurrent and Discrete World Models | Code | 0 |
| Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning | Code | 0 |
| Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data | Code | 0 |
| Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning | Code | 0 |
| Learning Action-Transferable Policy with Action Embedding | Code | 0 |
| LineFlow: A Framework to Learn Active Control of Production Lines | Code | 0 |
| DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning | Code | 0 |
| Deep Multi-Objective Reinforcement Learning for Utility-Based Infrastructural Maintenance Optimization | Code | 0 |
| Learning Versatile Skills with Curriculum Masking | Code | 0 |
| CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario | Code | 0 |
| Deep Multi-Agent Reinforcement Learning with Relevance Graphs | Code | 0 |
| Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization | Code | 0 |
| DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty | Code | 0 |
| Deep Learning in Neural Networks: An Overview | Code | 0 |
| Augmented Q Imitation Learning (AQIL) | Code | 0 |
| Hype or Heuristic? Quantum Reinforcement Learning for Join Order Optimisation | Code | 0 |
| Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent | Code | 0 |
| Lipschitz Continuity in Model-based Reinforcement Learning | Code | 0 |
| Learning to Listen, Read, and Follow: Score Following as a Reinforcement Learning Game | Code | 0 |
| Deep Learning-based Predictive Control of Battery Management for Frequency Regulation | Code | 0 |
| Circular Microalgae-Based Carbon Control for Net Zero | Code | 0 |
| Hyperbolic Discounting and Learning over Multiple Horizons | Code | 0 |
| A Hierarchical Architecture for Sequential Decision-Making in Autonomous Driving using Deep Reinforcement Learning | Code | 0 |
| Attentive Multi-Task Deep Reinforcement Learning | Code | 0 |
| Fourier Features in Reinforcement Learning with Neural Networks | Code | 0 |
| A Convergent Off-Policy Temporal Difference Algorithm | Code | 0 |
| A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning | Code | 0 |
| Dual Policy Distillation | Code | 0 |
| Attention-Based Model and Deep Reinforcement Learning for Distribution of Event Processing Tasks | Code | 0 |
| Dueling Network Architectures for Deep Reinforcement Learning | Code | 0 |
| Dueling Posterior Sampling for Preference-Based Reinforcement Learning | Code | 0 |
| Explicitly Encouraging Low Fractional Dimensional Trajectories Via Reinforcement Learning | Code | 0 |
| Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations | Code | 0 |
| Deep Inverse Reinforcement Learning for Structural Evolution of Small Molecules | Code | 0 |
| Deep Feature Space: A Geometrical Perspective | Code | 0 |
| Intrinsic fluctuations of reinforcement learning promote cooperation | Code | 0 |
| Attention-based Curiosity-driven Exploration in Deep Reinforcement Learning | Code | 0 |
| Hyperparameter Auto-tuning in Self-Supervised Robotic Learning | Code | 0 |
| Dynamically Optimal Treatment Allocation | Code | 0 |
| Learning Preferences for Interactive Autonomy | Code | 0 |
| Learning Principle of Least Action with Reinforcement Learning | Code | 0 |
| An Investigation of the Bias-Variance Tradeoff in Meta-Gradients | Code | 0 |
| CHEQ-ing the Box: Safe Variable Impedance Learning for Robotic Polishing | Code | 0 |
| Act-Then-Measure: Reinforcement Learning for Partially Observable Environments with Active Measuring | Code | 0 |
| Dynamic Computational Time for Visual Attention | Code | 0 |
| Hyperparameters in Contextual RL are Highly Situational | Code | 0 |
| FREED++: Improving RL Agents for Fragment-Based Molecule Generation by Thorough Reproduction | Code | 0 |
| Dynamic Control of a Fiber Manufacturing Process using Deep Reinforcement Learning | Code | 0 |
| Free energy-based reinforcement learning using a quantum processor | Code | 0 |
| Intrinsic Rewards from Self-Organizing Feature Maps for Exploration in Reinforcement Learning | Code | 0 |
Page 287 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |