Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
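The interaction loop described above can be illustrated with a minimal sketch. This is not taken from any paper listed on this page: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameters are hypothetical choices for illustration, and tabular Q-learning is just one of many RL algorithms.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the terminal state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy (assumed)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            action = rng.choice(ACTIONS)  # explore by acting at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned greedy policy should move right in every non-terminal state, since that is the shortest path to the only reward. Real tasks typically need function approximation and more careful exploration than this sketch uses.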

Papers



Title	Date	Tasks	Status	Hype	Score
Avalanche RL: a Continual Reinforcement Learning Library	Feb 28, 2022	Continual Learning, OpenAI Gym	Code Available	1	5
EDGE: Explaining Deep Reinforcement Learning Policies	Dec 1, 2021	Deep Reinforcement Learning, MuJoCo	Code Available	1	5
AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning	Mar 2, 2020	Deep Reinforcement Learning, High-Level Synthesis	Code Available	1	5
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning	Jan 15, 2019	Deep Reinforcement Learning, High-Level Synthesis	Code Available	1	5
AutoPhoto: Aesthetic Photo Capture using Reinforcement Learning	Sep 21, 2021	reinforcement-learning, Reinforcement Learning	Code Available	1	5
Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds	Oct 24, 2022	Deep Reinforcement Learning, Navigate	Code Available	1	5
Accelerating Deep Reinforcement Learning for Digital Twin Network Optimization with Evolutionary Strategies	Feb 1, 2022	Deep Reinforcement Learning, Management	Code Available	1	5
EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL	Jun 20, 2022	Question Answering, Question Generation	Code Available	1	5
Eagle: End-to-end Deep Reinforcement Learning based Autonomous Control of PTZ Cameras	Apr 10, 2023	Deep Reinforcement Learning, object-detection	Code Available	1	5
Edge Rewiring Goes Neural: Boosting Network Resilience without Rich Features	Oct 18, 2021	Graph Neural Network, reinforcement-learning	Code Available	1	5
DxFormer: A Decoupled Automatic Diagnostic System Based on Decoder-Encoder Transformer with Dense Symptom Representations	May 8, 2022	Decoder, Diagnostic	Code Available	1	5
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start	May 28, 2025	Math, Multimodal Reasoning	Code Available	1	5
Autonomous Racing using a Hybrid Imitation-Reinforcement Learning Architecture	Oct 11, 2021	Autonomous Racing, Autonomous Vehicles	Code Available	1	5
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails	Feb 7, 2025	Reinforcement Learning (RL), Synthetic Data Generation	Code Available	1	5
Dynamic Sparse Training for Deep Reinforcement Learning	Jun 8, 2021	continuous-control, Continuous Control	Code Available	1	5
Autonomous Exploration Under Uncertainty via Deep Reinforcement Learning on Graphs	Jul 24, 2020	Decision Making, Deep Reinforcement Learning	Code Available	1	5
DTR-Bench: An in silico Environment and Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime	May 28, 2024	Benchmarking, Reinforcement Learning (RL)	Code Available	1	5
Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis	Oct 20, 2023	Code Generation, Language Modelling	Code Available	1	5
Automating DBSCAN via Deep Reinforcement Learning	Aug 9, 2022	Clustering, Computational Efficiency	Code Available	1	5
Autonomous Reinforcement Learning: Formalism and Benchmarking	Dec 17, 2021	Benchmarking, reinforcement-learning	Code Available	1	5
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training	Apr 13, 2025	Reinforcement Learning (RL)	Code Available	1	5
DyNODE: Neural Ordinary Differential Equations for Dynamics Modeling in Continuous Control	Sep 9, 2020	continuous-control, Continuous Control	Code Available	1	5
Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability	May 19, 2025	RAG, Reinforcement Learning (RL)	Code Available	1	5
Action Space Shaping in Deep Reinforcement Learning	Apr 2, 2020	Deep Reinforcement Learning, reinforcement-learning	Code Available	1	5
DRL4Route: A Deep Reinforcement Learning Framework for Pick-up and Delivery Route Prediction	Jul 30, 2023	Deep Reinforcement Learning, reinforcement-learning	Code Available	1	5


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	PPG	Mean Normalized Performance	0.76	—	Unverified
2	PPO	Mean Normalized Performance	0.58	—	Unverified