SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
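As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one common RL algorithm among many. Everything in it is an assumption made for illustration and is not taken from any paper listed on this page: the `ChainEnv` corridor environment, the function names `train` and `greedy_policy`, and the hyperparameter values.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: a corridor of n_states cells.

    The agent starts in cell 0 and receives a reward of 1.0 only when it
    reaches the last cell, which ends the episode.
    """

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 0 moves left, action 1 moves right; walls clamp the move.
        if action == 1:
            self.state = min(self.state + 1, self.n_states - 1)
        else:
            self.state = max(self.state - 1, 0)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    env = ChainEnv()
    q = [[0.0, 0.0] for _ in range(env.n_states)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # exploit, breaking ties between equal values at random
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted best
            # value of the next state (zero if the episode has ended).
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(train())` yields one action per state. The discount factor `gamma` is what makes the agent value long-term reward: a reward several steps away still raises the value of the actions that lead toward it.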

Papers

Showing 1201–1225 of 15113 papers

Title | Status | Hype
B-Pref: Benchmarking Preference-Based Reinforcement Learning | Code | 1
Federated Reinforcement Learning with Environment Heterogeneity | Code | 1
Game-Theoretic Multiagent Reinforcement Learning | Code | 1
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning | Code | 1
End-to-End Affordance Learning for Robotic Manipulation | Code | 1
ENERO: Efficient Real-Time WAN Routing Optimization with Deep Reinforcement Learning | Code | 1
Autonomous Exploration Under Uncertainty via Deep Reinforcement Learning on Graphs | Code | 1
Emergent collective intelligence from massive-agent cooperation and competition | Code | 1
Bridging the Gap Between f-GANs and Wasserstein GANs | Code | 1
Bridging State and History Representations: Understanding Self-Predictive RL | Code | 1
Accelerated Sim-to-Real Deep Reinforcement Learning: Learning Collision Avoidance from Human Player | Code | 1
Bayesian Soft Actor-Critic: A Directed Acyclic Strategy Graph Based Deep Reinforcement Learning | Code | 1
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning | Code | 1
Building a Foundation for Data-Driven, Interpretable, and Robust Policy Design using the AI Economist | Code | 1
Automating DBSCAN via Deep Reinforcement Learning | Code | 1
CaiRL: A High-Performance Reinforcement Learning Environment Toolkit | Code | 1
Emergent behavior and neural dynamics in artificial agents tracking turbulent plumes | Code | 1
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models | Code | 1
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? | Code | 1
Can Question Rewriting Help Conversational Question Answering? | Code | 1
Can Learned Optimization Make Reinforcement Learning Less Difficult? | Code | 1
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Code | 1
Avalanche RL: a Continual Reinforcement Learning Library | Code | 1
Automatic Truss Design with Reinforcement Learning | Code | 1
Barrier Certified Safety Learning Control: When Sum-of-Square Programming Meets Reinforcement Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified