SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy mapping situations to actions that maximizes the expected long-term reward.
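The interaction loop described above can be sketched in a few lines. This is a minimal illustration only, not taken from any paper listed on this page: the `ChainEnv` corridor environment, the `train` and `greedy_policy` helpers, and all hyperparameter values are hypothetical, and tabular Q-learning is used simply as one well-known way to learn a policy from rewards.

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor: start in state 0, reward +1 on reaching the last state."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    env = ChainEnv()
    q = [[0.0, 0.0] for _ in range(env.n_states)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Explore with probability epsilon, and break ties randomly.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to action 1)."""
    return [0 if left > right else 1 for left, right in q]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the learned greedy policy should choose "move right" in every non-terminal state, since that is the shortest route to the only reward; the discount factor `gamma` is what makes shorter routes worth more than longer ones.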

Papers

Showing 14951–15000 of 15113 papers

Title | Status | Hype
An agent-driven semantical identifier using radial basis neural networks and reinforcement learning |  | 0
A Reinforcement Learning Model of Joy, Distress, Hope and Fear |  | 0
Inverse Reinforcement Learning with Multi-Relational Chains for Robot-Centered Smart Home |  | 0
Probabilistic inverse reinforcement learning in unknown environments |  | 0
Learning to Cooperate via Policy Search |  | 0
Reinforcement Learning of Cooperative Persuasive Dialogue Policies using Framing |  | 0
MONEYBaRL: Exploiting pitcher decision-making using Reinforcement Learning |  | 0
Learning in games via reinforcement and regularization |  | 0
Practical Kernel-Based Reinforcement Learning |  | 0
Extreme State Aggregation Beyond MDPs |  | 0
Reinforcement Learning Based Algorithm for the Maximization of EV Charging Station Revenue |  | 0
Thompson Sampling for Learning Parameterized Markov Decision Processes |  | 0
Reinforcement and Imitation Learning via Interactive No-Regret Learning |  | 0
Deterministic Policy Gradient Algorithms | Code | 0
Personalized Medical Treatments Using Novel Reinforcement Learning Algorithms |  | 0
Multi-objective Reinforcement Learning with Continuous Pareto Frontier Approximation Supplementary Material |  | 0
Model-based Reinforcement Learning and the Eluder Dimension |  | 0
Single-Agent vs. Multi-Agent Techniques for Concurrent Reinforcement Learning of Negotiation Dialogue Policies |  | 0
Comparing Multi-label Classification with Reinforcement Learning for Summarisation of Time-series Data |  | 0
Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces |  | 0
Projective simulation applied to the grid-world and the mountain-car problem |  | 0
Off-Policy Shaping Ensembles in Reinforcement Learning |  | 0
Structural Return Maximization for Reinforcement Learning |  | 0
Selecting Near-Optimal Approximate State Representations in Reinforcement Learning |  | 0
DINASTI: Dialogues with a Negotiating Appointment Setting Interface |  | 0
Deep Learning in Neural Networks: An Overview | Code | 0
Undirected Machine Translation with Discriminative Reinforcement Learning |  | 0
Comparison of Multi-agent and Single-agent Inverse Learning on a Simulated Soccer Example |  | 0
Multi-agent Inverse Reinforcement Learning for Two-person Zero-sum Games |  | 0
Simultaneous Perturbation Algorithms for Batch Off-Policy Search |  | 0
Near-optimal Reinforcement Learning in Factored MDPs |  | 0
Intrinsically Motivated Learning of Visual Motion Perception and Smooth Pursuit |  | 0
Better Optimism By Bayes: Adaptive Planning with Rich Models |  | 0
Generalization and Exploration via Randomized Value Functions | Code | 0
Safe Exploration of State and Action Spaces in Reinforcement Learning |  | 0
Non-Deterministic Policies in Markovian Decision Processes |  | 0
Kalman Temporal Differences |  | 0
Learning Partially Observable Deterministic Action Models |  | 0
A Multiagent Reinforcement Learning Algorithm with Non-linear Dynamics |  | 0
Exploiting generalisation symmetries in accuracy-based learning classifier systems: An initial study |  | 0
DJ-MC: A Reinforcement-Learning Agent for Music Playlist Recommendation |  | 0
Optimal Demand Response Using Device Based Reinforcement Learning |  | 0
Policy Shaping: Integrating Human Feedback with Reinforcement Learning |  | 0
Projected Natural Actor-Critic |  | 0
Reinforcement Learning in Robust Markov Decision Processes |  | 0
Using reinforcement learning to find an optimal set of features | Code | 0
Bellman Error Based Feature Generation using Random Projections on Sparse Spaces |  | 0
Efficient Exploration and Value Function Generalization in Deterministic Systems |  | 0
Off-policy reinforcement learning for H∞ control design |  | 0
Risk-sensitive Reinforcement Learning |  | 0
Page 300 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified