SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
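The interaction loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameters are hypothetical and not taken from any paper listed here. Tabular Q-learning stands in as one of many possible learning algorithms.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor weighting long-term reward


def step(state, action):
    """Toy environment: reward 1.0 on reaching the last state, else 0.0."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)  # explore, or break ties randomly
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

In this toy setting the learned greedy policy should move right in every state, since that is the only way to collect the reward. The feedback signal here is a single positive reward at the goal; real environments typically have far richer state, action, and reward structure.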

Papers

Showing 6601-6625 of 15113 papers

Title | Status | Hype
Functional Optimization Reinforcement Learning for Real-Time Bidding | - | 0
Guided Exploration in Reinforcement Learning via Monte Carlo Critic Optimization | Code | 0
Hierarchical Reinforcement Learning with Opponent Modeling for Distributed Multi-agent Cooperation | - | 0
Dynamic network congestion pricing based on deep reinforcement learning | - | 0
Learning the policy for mixed electric platoon control of automated and human-driven vehicles at signalized intersection: a random search approach | - | 0
Joint Representation Training in Sequential Tasks with Shared Structure | - | 0
Eco-driving for Electric Connected Vehicles at Signalized Intersections: A Parameterized Reinforcement Learning approach | - | 0
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning | - | 0
Reinforcement learning based adaptive metaheuristics | Code | 0
Modeling Adaptive Platoon and Reservation Based Autonomous Intersection Control: A Deep Reinforcement Learning Approach | - | 0
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems | - | 0
Value Function Decomposition for Iterative Design of Reinforcement Learning Agents | - | 0
Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation | - | 0
The Real Deal: A Review of Challenges and Opportunities in Moving Reinforcement Learning-Based Traffic Signal Control Systems Towards Reality | - | 0
Reinforcement Learning under Partial Observability Guided by Learned Environment Models | - | 0
Recursive Reinforcement Learning | - | 0
A Federated Reinforcement Learning Method with Quantization for Cooperative Edge Caching in Fog Radio Access Networks | - | 0
Learning Agile Skills via Adversarial Imitation of Rough Partial Demonstrations | - | 0
CGAR: Critic Guided Action Redistribution in Reinforcement Leaning | Code | 0
Learning Optimal Treatment Strategies for Sepsis Using Offline Reinforcement Learning in Continuous Space | - | 0
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation | - | 0
Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks | - | 0
Fusion of Model-free Reinforcement Learning with Microgrid Control: Review and Vision | - | 0
Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning | Code | 0
Auto-Encoding Adversarial Imitation Learning | - | 0
Page 265 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified