SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
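As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning. Everything in it is an assumption made for illustration and does not come from any listed paper: the five-cell corridor environment, the reward values, the hyperparameters, and the helper names `step`, `train`, and `greedy_policy`.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (assumed toy setup)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor that weights long-term reward
ALPHA = 0.5           # learning rate
EPSILON = 0.1         # probability of taking a random exploratory action


def step(state, action):
    """Hypothetical toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True      # reward for reaching the goal
    return next_state, -0.01, False       # small penalty for every other move


def train(episodes=200, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore occasionally, otherwise act greedily on current estimates.
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setup the learned greedy policy should choose "right" in every cell, since that is the shortest route to the only positive reward. Real RL problems differ mainly in scale: the table is replaced by a function approximator and the environment is not known in advance.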

Papers

Showing 12801-12850 of 15113 papers

Title | Status | Hype
Which Model to Trust: Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms for Continuous Control Tasks | Code | 0
Reinforcement Learning with State Observation Costs in Action-Contingent Noiselessly Observable Markov Decision Processes | Code | 0
MolOpt: Autonomous Molecular Geometry Optimization using Multi-Agent Reinforcement Learning | Code | 0
Online Prototype Alignment for Few-shot Policy Transfer | Code | 0
Reinforcement Learning with Quantum Variational Circuits | Code | 0
Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior | Code | 0
Tempo Adaptation in Non-stationary Reinforcement Learning | Code | 0
Mean Actor Critic | Code | 0
Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning | Code | 0
Temporal Alignment for History Representation in Reinforcement Learning | Code | 0
Temporal Shift Reinforcement Learning | Code | 0
What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning | Code | 0
Visual Transfer between Atari Games using Competitive Reinforcement Learning | Code | 0
Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks & Game Theory | Code | 0
Shrink-Perturb Improves Architecture Mixing during Population Based Training for Neural Architecture Search | Code | 0
Understanding when Dynamics-Invariant Data Augmentations Benefit Model-Free Reinforcement Learning Updates | Code | 0
Underwater Soft Fin Flapping Motion with Deep Neural Network Based Surrogate Model | Code | 0
Privacy-preserving Q-Learning with Functional Noise in Continuous State Spaces | Code | 0
Natural Language Generation Using Reinforcement Learning with External Rewards | Code | 0
Vanilla Gradient Descent for Oblique Decision Trees | Code | 0
Temporal Difference Variational Auto-Encoder | Code | 0
TabNAS: Rejection Sampling for Neural Architecture Search on Tabular Datasets | Code | 0
Towards Robust Deep Reinforcement Learning for Traffic Signal Control: Demand Surges, Incidents and Sensor Failures | Code | 0
Privacy-Preserving Q-Learning with Functional Noise in Continuous Spaces | Code | 0
Reinforcement Learning with Perturbed Rewards | Code | 0
Model-Based Offline Planning with Trajectory Pruning | Code | 0
Prioritized Soft Q-Decomposition for Lexicographic Reinforcement Learning | Code | 0
Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-World Long-term User Engagement in Sequential Recommender Systems | Code | 0
Reinforcement Learning with Parameterized Actions | Code | 0
Online Game Level Generation from Music | Code | 0
Sim-Env: Decoupling OpenAI Gym Environments from Simulation Models | Code | 0
Reinforcement Learning with Low-Complexity Liquid State Machines | Code | 0
Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion | Code | 0
Natural Environment Benchmarks for Reinforcement Learning | Code | 0
Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness | Code | 0
Model-based Lifelong Reinforcement Learning with Bayesian Exploration | Code | 0
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning | Code | 0
HMM for Discovering Decision-Making Dynamics Using Reinforcement Learning Experiments | Code | 0
Temporally-Extended ε-Greedy Exploration | Code | 0
SimpleDS: A Simple Deep Reinforcement Learning Dialogue System | Code | 0
Red Teaming with Mind Reading: White-Box Adversarial Policies Against RL Agents | Code | 0
Simple Noisy Environment Augmentation for Reinforcement Learning | Code | 0
Reinforcement Learning with Euclidean Data Augmentation for State-Based Continuous Control | Code | 0
Boosting Exploration in Multi-Task Reinforcement Learning using Adversarial Networks | Code | 0
Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity | Code | 0
Simple random search of static linear policies is competitive for reinforcement learning | Code | 0
Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection | Code | 0
Principled Exploration via Optimistic Bootstrapping and Backward Induction | Code | 0
simple_rl: Reproducible Reinforcement Learning in Python | Code | 0
Pre-training with Non-expert Human Demonstration for Deep Reinforcement Learning | Code | 0
Page 257 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified