SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to act in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, receiving a reward or a penalty for each action. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
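The interaction loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from any paper listed here: the five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameters are hypothetical, and tabular Q-learning is used simply as one familiar way to learn a policy from rewards.

```python
import random

# Hypothetical toy setup (an assumption for illustration only).
N_STATES = 5          # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # reward only on reaching the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` should yield a policy that moves right in every non-terminal state, since that is the only way to reach the rewarded goal in this toy chain. The discount factor `gamma` is what makes the agent value long-term reward rather than only the immediate one.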

Papers

Showing 1226-1250 of 15113 papers

Title | Status | Hype
Are Expressive Models Truly Necessary for Offline RL? | Code | 1
Zero-Shot Reinforcement Learning from Low Quality Data | Code | 1
Compiler Optimization for Quantum Computing Using Reinforcement Learning | Code | 1
Deep Symbolic Superoptimization Without Human Knowledge | Code | 1
Deep Transformer Q-Networks for Partially Observable Reinforcement Learning | Code | 1
Competitiveness of MAP-Elites against Proximal Policy Optimization on locomotion tasks in deterministic simulations | Code | 1
Compile Scene Graphs with Reinforcement Learning | Code | 1
Comparing Observation and Action Representations for Deep Reinforcement Learning in μRTS | Code | 1
Comparing Popular Simulation Environments in the Scope of Robotics and Reinforcement Learning | Code | 1
A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Code | 1
Denoised MDPs: Learning World Models Better Than the World Itself | Code | 1
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Code | 1
Compositional Reinforcement Learning from Logical Specifications | Code | 1
CommonPower: A Framework for Safe Data-Driven Smart Grid Control | Code | 1
Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings | Code | 1
Communicative Reinforcement Learning Agents for Landmark Detection in Brain Images | Code | 1
Dialogue for Prompting: a Policy-Gradient-Based Discrete Prompt Generation for Few-shot Learning | Code | 1
Improving Planning with Large Language Models: A Modular Agentic Architecture | Code | 1
Visual Grounding for Object-Level Generalization in Reinforcement Learning | Code | 1
A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning | Code | 1
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks | Code | 1
Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Code | 1
CompoSuite: A Compositional Reinforcement Learning Benchmark | Code | 1
Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning | Code | 1
Contrastive Reinforcement Learning of Symbolic Reasoning Domains | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified