SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
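The agent-environment loop described above can be sketched with tabular Q-learning, one of many RL algorithms and not one this page singles out. Everything in the sketch is an assumption made for illustration: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and the values chosen for the learning rate `alpha` and discount factor `gamma`.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the terminal goal (hypothetical toy task)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward of 1 only when the goal is reached, 0 otherwise.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values Q[state][action] from interaction."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Behave randomly; Q-learning is off-policy, so it can still
            # learn the values of the greedy policy.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Target = immediate reward + discounted best future value.
            # The terminal row of q is never updated, so it stays at 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_values = train()
    print(greedy_policy(q_values))
```

In this toy task the reward arrives only at the goal, so the learned policy should prefer moving right in every state: the discount factor makes earlier arrival worth more, which is the "long-term reward" trade-off in miniature.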

Papers

Showing 7551-7600 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Automated Reinforcement Learning: An Overview | | 0 |
| Dyna-T: Dyna-Q and Upper Confidence Bounds Applied to Trees | | 0 |
| The Recurrent Reinforcement Learning Crypto Agent | | 0 |
| Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents | | 0 |
| Multi-echelon Supply Chains with Uncertain Seasonal Demands and Lead Times Using Deep Reinforcement Learning | | 0 |
| Task Independent Capsule-Based Agents for Deep Q-Learning | | 0 |
| Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making | | 0 |
| STIR^2: Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks | | 0 |
| Active Reinforcement Learning -- A Roadmap Towards Curious Classifier Systems for Self-Adaptation | | 0 |
| Benchmarking Deep Reinforcement Learning Algorithms for Vision-based Robotics | | 0 |
| Automated Reinforcement Learning (AutoRL): A Survey and Open Problems | | 0 |
| Distributed Cooperative Multi-Agent Reinforcement Learning with Directed Coordination Graph | | 0 |
| Opportunities of Hybrid Model-based Reinforcement Learning for Cell Therapy Manufacturing Process Control | | 0 |
| State of the Art of User Simulation approaches for conversational information retrieval | | 0 |
| When is Offline Two-Player Zero-Sum Markov Game Solvable? | | 0 |
| A Multi-agent Reinforcement Learning Approach for Efficient Client Selection in Federated Learning | | 0 |
| Assessing Policy, Loss and Planning Combinations in Reinforcement Learning using a New Modular Architecture | | 0 |
| Neural Network Optimization for Reinforcement Learning Tasks Using Sparse Computations | | 0 |
| Offline Reinforcement Learning for Road Traffic Control | | 0 |
| Combining Reinforcement Learning and Inverse Reinforcement Learning for Asset Allocation Recommendations | | 0 |
| Offsetting Unequal Competition through RL-assisted Incentive Schemes | | 0 |
| Deep Learning-based Predictive Control of Battery Management for Frequency Regulation | Code | 0 |
| Learning Complex Spatial Behaviours in ABM: An Experimental Observational Study | | 0 |
| Deep Reinforcement Learning, a textbook | | 0 |
| Analyzing Micro-Founded General Equilibrium Models with Many Agents using Deep Reinforcement Learning | | 0 |
| Execute Order 66: Targeted Data Poisoning for Reinforcement Learning | | 0 |
| A Deeper Understanding of State-Based Critics in Multi-Agent Reinforcement Learning | | 0 |
| Actor-Critic Network for Q&A in an Adversarial Environment | | 0 |
| Toward Causal-Aware RL: State-Wise Action-Refined Temporal Difference | Code | 0 |
| Robust Algorithmic Collusion | | 0 |
| Reinforcement Learning for Task Specifications with Action-Constraints | | 0 |
| Temporal Complementarity-Guided Reinforcement Learning for Image-to-Video Person Re-Identification | | 0 |
| Transfer RL across Observation Feature Spaces via Model-Based Regularization | | 0 |
| Operator Deep Q-Learning: Zero-Shot Reward Transferring in Reinforcement Learning | | 0 |
| Toward Pareto Efficient Fairness-Utility Trade-off in Recommendation through Reinforcement Learning | | 0 |
| Symmetry-Aware Neural Architecture for Embodied Visual Exploration | | 0 |
| Joint Learning-Based Stabilization of Multiple Unknown Linear Systems | | 0 |
| A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning | | 0 |
| Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning | | 0 |
| Stochastic convex optimization for provably efficient apprenticeship learning | | 0 |
| A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning | Code | 0 |
| Using Graph-Aware Reinforcement Learning to Identify Winning Strategies in Diplomacy Games (Student Abstract) | | 0 |
| Single-Shot Pruning for Offline Reinforcement Learning | | 0 |
| Robust Entropy-regularized Markov Decision Processes | | 0 |
| Stability-Preserving Automatic Tuning of PID Control with Reinforcement Learning | | 0 |
| Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation | | 0 |
| Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates | | 0 |
| MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning | Code | 0 |
| Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates | | 0 |
| Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |