SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, in the form of rewards or penalties, for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
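The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, chosen here purely for illustration and not taken from any paper listed on this page. The corridor environment and the names `step`, `train`, `greedy_policy`, and `discounted_return` are hypothetical.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells; reaching cell 4 ends the episode.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def discounted_return(rewards, gamma=0.99):
    """Cumulative reward with discount factor gamma: r0 + gamma*r1 + gamma^2*r2 + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def step(state, action):
    """Environment feedback: next state, reward, and whether the episode ended."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the learned policy moves right in every non-terminal cell, since that is the only way to collect the single reward at the end of the corridor.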

Papers

Showing 1151–1175 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Electric Vehicle Routing Problem for Emergency Power Supply: Towards Telecom Base Station Relief | Code | 1 |
| AutoPhoto: Aesthetic Photo Capture using Reinforcement Learning | Code | 1 |
| AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning | Code | 1 |
| A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation | Code | 1 |
| Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience | Code | 1 |
| Barrier Certified Safety Learning Control: When Sum-of-Square Programming Meets Reinforcement Learning | Code | 1 |
| Enforcing Policy Feasibility Constraints through Differentiable Projection for Energy Optimization | Code | 1 |
| Batch Exploration with Examples for Scalable Robotic Reinforcement Learning | Code | 1 |
| AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning | Code | 1 |
| Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning | Code | 1 |
| A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem | Code | 1 |
| BayesSimIG: Scalable Parameter Inference for Adaptive Domain Randomization with IsaacGym | Code | 1 |
| Eigenoption Discovery through the Deep Successor Representation | Code | 1 |
| End-to-End Urban Driving by Imitating a Reinforcement Learning Coach | Code | 1 |
| Enhancing Graph Neural Network-based Fraud Detectors against Camouflaged Fraudsters | Code | 1 |
| Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation | Code | 1 |
| Efficient Risk-Averse Reinforcement Learning | Code | 1 |
| Autonomous Racing using a Hybrid Imitation-Reinforcement Learning Architecture | Code | 1 |
| Enhancing SAT solvers with glue variable predictions | Code | 1 |
| Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification | Code | 1 |
| An Open-Source Multi-Goal Reinforcement Learning Environment for Robotic Manipulation with Pybullet | Code | 1 |
| Behavior From the Void: Unsupervised Active Pre-Training | Code | 1 |
| Entropy-Regularized Process Reward Model | Code | 1 |
| Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement | Code | 1 |
| Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach | Code | 1 |
Page 47 of 605

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |