SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
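The loop described above (act, observe a reward, update the policy) can be sketched with tabular Q-learning on a toy corridor. This is only an illustrative sketch: Q-learning is one of many RL algorithms and is not singled out by this page, and the environment, the names `step`, `train`, `greedy_policy`, `discounted_return`, and all hyperparameter values are assumptions made for the example.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward, with the reward at step t weighted by gamma ** t."""
    return sum(r * gamma ** t for t, r in enumerate(rewards))


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties randomly
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` yields the learned action for each non-terminal cell. In this corridor the reward only arrives at the goal, so the policy that maximizes long-term reward is to step right in every cell.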

Papers

Showing 9901-9950 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Semi-Supervised Off Policy Reinforcement Learning | | 0 |
| The Architectural Implications of Distributed Reinforcement Learning on CPU-GPU Systems | | 0 |
| Resolving Implicit Coordination in Multi-Agent Deep Reinforcement Learning with Deep Q-Networks & Game Theory | Code | 0 |
| Emergence of Different Modes of Tool Use in a Reaching and Dragging Task | | 0 |
| Efficient Reservoir Management through Deep Reinforcement Learning | | 0 |
| Battery Model Calibration with Deep Reinforcement Learning | | 0 |
| Selective Pseudo-Labeling with Reinforcement Learning for Semi-Supervised Domain Adaptation | | 0 |
| Vehicular Cooperative Perception Through Action Branching and Federated Reinforcement Learning | | 0 |
| Fever Basketball: A Complex, Flexible, and Asynchronized Sports Game Environment for Multi-agent Reinforcement Learning | | 0 |
| Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation | | 0 |
| Multi-agent navigation based on deep reinforcement learning and traditional pathfinding algorithm | | 0 |
| Neural Dynamic Policies for End-to-End Sensorimotor Learning | | 0 |
| Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation | | 0 |
| Model-Agnostic Learning to Meta-Learn | | 0 |
| Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments | | 0 |
| Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design | Code | 0 |
| Dynamic RAN Slicing for Service-Oriented Vehicular Networks via Constrained Learning | | 0 |
| DeepCrawl: Deep Reinforcement Learning for Turn-based Strategy Games | | 0 |
| Designing a Prospective COVID-19 Therapeutic with Reinforcement Learning | | 0 |
| Partially Connected Automated Vehicle Cooperative Control Strategy with a Deep Reinforcement Learning Approach | | 0 |
| A Safe Reinforcement Learning Architecture for Antenna Tilt Optimisation | | 0 |
| Pareto Deterministic Policy Gradients and Its Application in 5G Massive MIMO Networks | | 0 |
| Sample Complexity of Policy Gradient Finding Second-Order Stationary Points | | 0 |
| Coinbot: Intelligent Robotic Coin Bag Manipulation Using Deep Reinforcement Learning And Machine Teaching | | 0 |
| Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER | | 0 |
| Driving-Policy Adaptive Safeguard for Autonomous Vehicles Using Reinforcement Learning | | 0 |
| Are Gradient-based Saliency Maps Useful in Deep Reinforcement Learning? | | 0 |
| BSODA: A Bipartite Scalable Framework for Online Disease Diagnosis | | 0 |
| Combining Cognitive Modeling and Reinforcement Learning for Clarification in Dialogue | | 0 |
| Is Long Horizon RL More Difficult Than Short Horizon RL? | | 0 |
| ExpanRL: Hierarchical Reinforcement Learning for Course Concept Expansion in MOOCs | | 0 |
| EcoLight: Intersection Control in Developing Regions Under Extreme Budget and Network Constraints | | 0 |
| Improving Neural Machine Translation for Sanskrit-English | | 0 |
| Improving the Naturalness and Diversity of Referring Expression Generation models using Minimum Risk Training | | 0 |
| Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition | | 0 |
| Assessing and Accelerating Coverage in Deep Reinforcement Learning | | 0 |
| Instance-based Generalization in Reinforcement Learning | | 0 |
| Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning | | 0 |
| Answer-driven Deep Question Generation based on Reinforcement Learning | | 0 |
| A Local Temporal Difference Code for Distributional Reinforcement Learning | | 0 |
| A Learning-Exploring Method to Generate Diverse Paraphrases with Multi-Objective Deep Reinforcement Learning | | 0 |
| A new convergent variant of Q-learning with linear function approximation | | 0 |
| Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory | | 0 |
| Robust Multi-Agent Reinforcement Learning with Model Uncertainty | | 0 |
| Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method | | 0 |
| Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms | | 0 |
| R-learning in actor-critic model offers a biologically relevant mechanism for sequential decision-making | | 0 |
| RL Unplugged: A Collection of Benchmarks for Offline Reinforcement Learning | Code | 0 |
| Text Simplification with Reinforcement Learning Using Supervised Rewards on Grammaticality, Meaning Preservation, and Simplicity | | 0 |
| On Efficiency in Hierarchical Reinforcement Learning | | 0 |
Page 199 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |