SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
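The interaction loop described above can be illustrated with a minimal sketch. It assumes a hypothetical five-state corridor environment and uses tabular Q-learning with epsilon-greedy exploration as one possible learning rule. The environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameters are illustrative assumptions, not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment (an assumption for illustration): a 5-state
# corridor. The agent starts in state 0 and receives reward 1.0 only when it
# reaches the rightmost state, which ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for one environment transition."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), otherwise exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Move Q(s, a) toward the reward plus the discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]
```

Under these assumed settings, `greedy_policy(train())` should select the move-right action in every non-terminal state, which is the policy that maximizes the discounted long-term reward in this toy corridor.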

Papers

Showing 3476–3500 of 15113 papers

Title | Status | Hype
Deep Q-Learning based Reinforcement Learning Approach for Network Intrusion Detection | Code | 0
Guided Exploration in Reinforcement Learning via Monte Carlo Critic Optimization | Code | 0
Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning | Code | 0
Reinforcement Learning from Hierarchical Critics | Code | 0
Deep Q-learning from Demonstrations | Code | 0
Optimistic Distributionally Robust Policy Optimization | Code | 0
Green Simulation Assisted Reinforcement Learning with Model Risk for Biomanufacturing Learning and Control | Code | 0
Automata Learning meets Shielding | Code | 0
GREEN-CODE: Learning to Optimize Energy Efficiency in LLM-based Code Generation | Code | 0
Grounding Language for Transfer in Deep Reinforcement Learning | Code | 0
ARCHER: Aggressive Rewards to Counter bias in Hindsight Experience Replay | Code | 0
Graph Convolutional Reinforcement Learning | Code | 0
Automated Curriculum Learning by Rewarding Temporally Rare Events | Code | 0
Grammars and reinforcement learning for molecule optimization | Code | 0
Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning | Code | 0
Graph Backup: Data Efficient Backup Exploiting Markovian Transitions | Code | 0
Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM | Code | 0
GraphNAS: Graph Neural Architecture Search with Reinforcement Learning | Code | 0
Gotta Learn Fast: A New Benchmark for Generalization in RL | Code | 0
GoSum: Extractive Summarization of Long Documents by Reinforcement Learning and Graph Organized discourse state | Code | 0
Optimizing Warfarin Dosing using Deep Reinforcement Learning | Code | 0
Automated Discovery of Local Rules for Desired Collective-Level Behavior Through Reinforcement Learning | Code | 0
Accelerating Reinforcement Learning through GPU Atari Emulation | Code | 0
Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem | Code | 0
ARAML: A Stable Adversarial Training Framework for Text Generation | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified