SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
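The interaction loop described above can be sketched with tabular Q-learning on a toy problem. This is only an illustrative sketch, not a method taken from any paper listed on this page: the five-state corridor environment, the `step` function, and the hyperparameters (`GAMMA`, `ALPHA`, `EPSILON`) are all assumptions chosen for brevity.

```python
import random

# Hypothetical toy setup (not from the source): a corridor of 5 states.
N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor for future rewards (assumed)
ALPHA = 0.5           # learning rate (assumed)
EPSILON = 0.1         # exploration probability (assumed)


def step(state, action):
    """Environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, max_steps=100, seed=0):
    """Run the agent-environment loop and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best next value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the only reward lies at the right end of the corridor, so the learned greedy policy should choose "right" (action 1) in every non-terminal state. The discount factor is what makes the agent prefer reaching the reward in fewer steps, which is one concrete reading of "maximizing long-term reward".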

Papers

Showing 9901-9925 of 15113 papers

Title | Status | Hype
Semi-Supervised Off Policy Reinforcement Learning | | 0
The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems | | 0
Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks & Game Theory | Code | 0
Emergence of Different Modes of Tool Use in a Reaching and Dragging Task | | 0
Efficient Reservoir Management through Deep Reinforcement Learning | | 0
Battery Model Calibration with Deep Reinforcement Learning | | 0
Selective Pseudo-Labeling with Reinforcement Learning for Semi-Supervised Domain Adaptation | | 0
Vehicular Cooperative Perception Through Action Branching and Federated Reinforcement Learning | | 0
Fever Basketball: A Complex, Flexible, and Asynchronized Sports Game Environment for Multi-agent Reinforcement Learning | | 0
Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation | | 0
Multi-agent navigation based on deep reinforcement learning and traditional pathfinding algorithm | | 0
Neural Dynamic Policies for End-to-End Sensorimotor Learning | | 0
Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation | | 0
Model-Agnostic Learning to Meta-Learn | | 0
Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments | | 0
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design | Code | 0
Dynamic RAN Slicing for Service-Oriented Vehicular Networks via Constrained Learning | | 0
DeepCrawl: Deep Reinforcement Learning for Turn-based Strategy Games | | 0
Designing a Prospective COVID-19 Therapeutic with Reinforcement Learning | | 0
Partially Connected Automated Vehicle Cooperative Control Strategy with a Deep Reinforcement Learning Approach | | 0
A Safe Reinforcement Learning Architecture for Antenna Tilt Optimisation | | 0
Pareto Deterministic Policy Gradients and Its Application in 5G Massive MIMO Networks | | 0
Sample Complexity of Policy Gradient Finding Second-Order Stationary Points | | 0
Coinbot: Intelligent Robotic Coin Bag Manipulation Using Deep Reinforcement Learning And Machine Teaching | | 0
Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER | | 0
Page 397 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified