SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
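As a rough illustration of the loop described above (act, receive a reward, improve the policy), here is a minimal sketch of tabular Q-learning on a made-up five-state chain. The environment, the `step` function, and all constants (`GAMMA`, `ALPHA`, `EPSILON`) are assumptions chosen for illustration. They are not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a chain of states 0..4.
# Reaching state 4 yields a reward of 1 and ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)   # move left or move right
GAMMA = 0.9          # discount factor (illustrative value)
ALPHA = 0.5          # learning rate (illustrative value)
EPSILON = 0.1        # exploration probability (illustrative value)


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, seed=0):
    """Learn a Q-table with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < EPSILON:
                a = rng.randrange(2)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: reward plus discounted best future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each non-terminal state."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

On this toy chain the learned greedy policy would be expected to move right in every state, since that is the only way to reach the reward. Real problems usually need function approximation rather than a table.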

Papers

Showing 14751-14800 of 15113 papers

Title | Status | Hype
Lifelong Inverse Reinforcement Learning | Code | 0
Counterfactual State Explanations for Reinforcement Learning Agents via Generative Deep Learning | Code | 0
Counterfactual Explanation with Multi-Agent Reinforcement Learning for Drug Target Prediction | Code | 0
Learning to Reach Goals via Iterated Supervised Learning | Code | 0
Introspective Experience Replay: Look Back When Surprised | Code | 0
Goal Recognition as Reinforcement Learning | Code | 0
Counterfactual Explanations for Continuous Action Reinforcement Learning | Code | 0
Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation | Code | 0
Enhancing New-item Fairness in Dynamic Recommender Systems | Code | 0
Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data | Code | 0
A Robust Quantile Huber Loss With Interpretable Parameter Adjustment In Distributional Reinforcement Learning | Code | 0
Adaptive Partial Scanning Transmission Electron Microscopy with Reinforcement Learning | Code | 0
Adaptive Ordered Information Extraction with Deep Reinforcement Learning | Code | 0
Adaptive Natural Language Generation for Task-oriented Dialogue via Reinforcement Learning | Code | 0
Enhancing variational quantum state diagonalization using reinforcement learning techniques | Code | 0
Counterexample Guided RL Policy Refinement Using Bayesian Optimization | Code | 0
Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies | Code | 0
An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation | Code | 0
Learning Task Agnostic Skills with Data-driven Guidance | Code | 0
ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models | Code | 0
Count-Based Exploration with the Successor Representation | Code | 0
A review on Deep Reinforcement Learning for Fluid Mechanics | Code | 0
Beyond Optimism: Exploration With Partially Observable Rewards | Code | 0
Enhancing Safety for Autonomous Agents in Partly Concealed Urban Traffic Environments Through Representation-Based Shielding | Code | 0
Count-Based Exploration in Feature Space for Reinforcement Learning | Code | 0
Cost Effective MLaaS Federation: A Combinatorial Reinforcement Learning Approach | Code | 0
Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs | Code | 0
Google Research Football: A Novel Reinforcement Learning Environment | Code | 0
Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning | Code | 0
GoSum: Extractive Summarization of Long Documents by Reinforcement Learning and Graph Organized discourse state | Code | 0
Enhancing Visual Dialog Questioner with Entity-based Strategy Learning and Augmented Guesser | Code | 0
Co-Speech Gesture Synthesis by Reinforcement Learning With Contrastive Pre-Trained Rewards | Code | 0
Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning | Code | 0
Gotta Learn Fast: A New Benchmark for Generalization in RL | Code | 0
An Efficient Deep Reinforcement Learning Model for Urban Traffic Control | Code | 0
Better Safe than Sorry: Evidence Accumulation Allows for Safe Reinforcement Learning | Code | 0
Controlled Interacting Particle Algorithms for Simulation-based Reinforcement Learning | Code | 0
Corruption-Robust Offline Reinforcement Learning with General Function Approximation | Code | 0
Accelerating Reinforcement Learning through GPU Atari Emulation | Code | 0
Action Robust Reinforcement Learning and Applications in Continuous Control | Code | 0
GRAC: Self-Guided and Self-Regularized Actor-Critic | Code | 0
Better Rewards Yield Better Summaries: Learning to Summarise Without References | Code | 0
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning | Code | 0
Adaptive Gain Scheduling using Reinforcement Learning for Quadcopter Control | Code | 0
BertRLFuzzer: A BERT and Reinforcement Learning Based Fuzzer | Code | 0
Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Code | 0
Entity Abstraction in Visual Model-Based Reinforcement Learning | Code | 0
Correct Me If You Can: Learning from Error Corrections and Markings | Code | 0
An Efficient Combinatorial Optimization Model Using Learning-to-Rank Distillation | Code | 0
Improving Optimization Bounds using Machine Learning: Decision Diagrams meet Deep Reinforcement Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified