SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
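As a rough illustration of the interaction loop described above, the sketch below runs an epsilon-greedy agent on a two-armed bandit. It is a deliberately simplified, stateless case rather than a full RL setup, and every name in it (`TwoArmedBandit`, `run_epsilon_greedy`, the payout probabilities, the seeds) is a hypothetical choice made for this example.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy environment: each arm pays 1.0 with a fixed probability."""

    def __init__(self, probs=(0.2, 0.8), seed=0):
        self.probs = probs
        self.rng = random.Random(seed)

    def step(self, action):
        # Feedback for the chosen action: reward 1.0 or 0.0.
        return 1.0 if self.rng.random() < self.probs[action] else 0.0


def run_epsilon_greedy(env, n_steps=2000, epsilon=0.1, seed=1):
    """Act, observe the reward, and update per-action value estimates."""
    rng = random.Random(seed)
    values = [0.0, 0.0]  # running estimate of each action's mean reward
    counts = [0, 0]      # how often each action was taken
    total_reward = 0.0   # cumulative reward collected by the agent

    for _ in range(n_steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore: pick a random action
        else:
            action = max(range(2), key=lambda a: values[a])  # exploit
        reward = env.step(action)
        counts[action] += 1
        # Incremental mean update of the chosen action's estimate.
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward

    return values, counts, total_reward


values, counts, total_reward = run_epsilon_greedy(TwoArmedBandit())
print(values, counts, total_reward)
```

The greedy choice over `values` plays the role of the learned policy here. A full RL problem would additionally track environment state and discount future rewards, which this sketch omits.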

Papers

Showing 13451-13500 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Distilled Agent DQN for Provable Adversarial Robustness | | 0 |
| Exploration by Uncertainty in Reward Space | | 0 |
| A Convergent Variant of the Boltzmann Softmax Operator in Reinforcement Learning | | 0 |
| Learning Physics Priors for Deep Reinforcement Learing | | 0 |
| Learning to Coordinate Multiple Reinforcement Learning Agents for Diverse Query Reformulation | | 0 |
| Expressiveness in Deep Reinforcement Learning | | 0 |
| Exploiting Environmental Variation to Improve Policy Robustness in Reinforcement Learning | | 0 |
| Countering Language Drift via Grounding | | 0 |
| Deep Reinforcement Learning of Universal Policies with Diverse Environment Summaries | | 0 |
| Incremental Hierarchical Reinforcement Learning with Multitask LMDPs | | 0 |
| Convergent Reinforcement Learning with Function Approximation: A Bilevel Optimization Perspective | | 0 |
| Constraining Action Sequences with Formal Languages for Deep Reinforcement Learning | | 0 |
| Hybrid Policies Using Inverse Rewards for Reinforcement Learning | | 0 |
| Dynamic Pricing on E-commerce Platform with Deep Reinforcement Learning | | 0 |
| A Better Baseline for Second Order Gradient Estimation in Stochastic Computation Graphs | | 0 |
| DOMAIN ADAPTATION VIA DISTRIBUTION AND REPRESENTATION MATCHING: A CASE STUDY ON TRAINING DATA SELECTION VIA REINFORCEMENT LEARNING | | 0 |
| Accelerated Value Iteration via Anderson Mixing | | 0 |
| DEEP ADVERSARIAL FORWARD MODEL | | 0 |
| Definition and evaluation of model-free coordination of electrical vehicle charging with reinforcement learning | | 0 |
| Guided Exploration in Deep Reinforcement Learning | | 0 |
| Controllable Neural Story Plot Generation via Reward Shaping | Code | 0 |
| Learning Navigation Behaviors End-to-End with AutoRL | | 0 |
| Learning through Probing: a decentralized reinforcement learning architecture for social dilemmas | | 0 |
| AlphaSeq: Sequence Discovery with Deep Reinforcement Learning | | 0 |
| Omega-Regular Objectives in Model-Free Reinforcement Learning | | 0 |
| S-RL Toolbox: Environments, Datasets and Evaluation Metrics for State Representation Learning | Code | 0 |
| Resilient Computing with Reinforcement Learning on a Dynamical System: Case Study in Sorting | | 0 |
| Anderson Acceleration for Reinforcement Learning | | 0 |
| Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction | | 0 |
| Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals | | 0 |
| Low Precision Policy Distillation with Application to Low-Power, Real-time Sensation-Cognition-Action Loop with Neuromorphic Computing | | 0 |
| EpiRL: A Reinforcement Learning Agent to Facilitate Epistasis Detection | | 0 |
| Better Safe than Sorry: Evidence Accumulation Allows for Safe Reinforcement Learning | Code | 0 |
| Personalized Education at Scale | | 0 |
| SDN Flow Entry Management Using Reinforcement Learning | | 0 |
| On Reinforcement Learning for Full-length Game of StarCraft | | 0 |
| A Learning Framework for High Precision Industrial Assembly | | 0 |
| Geometric Multi-Model Fitting by Deep Reinforcement Learning | | 0 |
| Finite Sample Analysis of the GTD Policy Evaluation Algorithms in Markov Setting | | 0 |
| Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration | | 0 |
| Constrained Exploration and Recovery from Experience Shaping | Code | 0 |
| Target Transfer Q-Learning and Its Convergence Analysis | | 0 |
| Sim-to-Real Transfer of Robot Learning with Variable Length Inputs | | 0 |
| Benchmarking Reinforcement Learning Algorithms on Real-World Robots | Code | 0 |
| IntelligentCrowd: Mobile Crowdsensing via Multi-Agent Reinforcement Learning | | 0 |
| Dynamic Weights in Multi-Objective Deep Reinforcement Learning | Code | 0 |
| Interpretable Reinforcement Learning with Ensemble Methods | | 0 |
| Prosocial or Selfish? Agents with different behaviors for Contract Negotiation using Reinforcement Learning | | 0 |
| Model-Free Adaptive Optimal Control of Episodic Fixed-Horizon Manufacturing Processes using Reinforcement Learning | Code | 0 |
| SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions | | 0 |
Page 270 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |