SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
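As a rough illustration of the loop described above, the sketch below trains an agent with tabular Q-learning on a tiny five-state chain. Both are assumptions made for this example: the page names no particular algorithm or environment, and `step`, `train`, `greedy_policy`, and the hyperparameters are hypothetical. The agent acts, receives a reward, and updates its estimate of long-term reward; the learned policy is then read off by picking the highest-valued action in each state.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the goal state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # explore at random (off-policy)
            next_state, reward, done = step(state, action)
            # Bootstrapped estimate of the discounted long-term reward.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Because Q-learning is off-policy, the agent can explore at random and still learn values for the greedy policy; in this toy chain that policy should move right in every state.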

Papers

Showing 1201–1250 of 15113 papers

Title | Status | Hype
Connecting Deep-Reinforcement-Learning-based Obstacle Avoidance with Conventional Global Planners using Waypoint Generators | Code | 1
Deep Reinforcement Learning at the Edge of the Statistical Precipice | Code | 1
Game-Theoretic Multiagent Reinforcement Learning | Code | 1
Deep Reinforcement Learning based Group Recommender System | Code | 1
Deep Reinforcement Learning Control of Quantum Cartpoles | Code | 1
Deep Reinforcement Learning for Cryptocurrency Trading: Practical Approach to Address Backtest Overfitting | Code | 1
Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games | Code | 1
Comparing Deep Reinforcement Learning Algorithms in Two-Echelon Supply Chains | Code | 1
Deep Reinforcement Learning for Computational Fluid Dynamics on HPC Systems | Code | 1
Deep Reinforcement Learning for Conservation Decisions | Code | 1
Conservative Q-Learning for Offline Reinforcement Learning | Code | 1
Deep Reinforcement Learning for Joint Spectrum and Power Allocation in Cellular Networks | Code | 1
A Distributional Perspective on Reinforcement Learning | Code | 1
Deep Reinforcement Learning for List-wise Recommendations | Code | 1
Conditional Mutual Information for Disentangled Representations in Reinforcement Learning | Code | 1
Concise Reasoning via Reinforcement Learning | Code | 1
Deep Reinforcement Learning for Solving the Heterogeneous Capacitated Vehicle Routing Problem | Code | 1
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models | Code | 1
Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data | Code | 1
Deep Reinforcement Learning for URLLC data management on top of scheduled eMBB traffic | Code | 1
Confidence Estimation Transformer for Long-term Renewable Energy Forecasting in Reinforcement Learning-based Power Grid Dispatching | Code | 1
An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality | Code | 1
Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions | Code | 1
Deep Reinforcement Learning with Population-Coded Spiking Neural Network for Continuous Control | Code | 1
Computational Performance of Deep Reinforcement Learning to find Nash Equilibria | Code | 1
Are Expressive Models Truly Necessary for Offline RL? | Code | 1
Zero-Shot Reinforcement Learning from Low Quality Data | Code | 1
Compiler Optimization for Quantum Computing Using Reinforcement Learning | Code | 1
Deep Symbolic Superoptimization Without Human Knowledge | Code | 1
Deep Transformer Q-Networks for Partially Observable Reinforcement Learning | Code | 1
Competitiveness of MAP-Elites against Proximal Policy Optimization on locomotion tasks in deterministic simulations | Code | 1
Compile Scene Graphs with Reinforcement Learning | Code | 1
Comparing Observation and Action Representations for Deep Reinforcement Learning in μRTS | Code | 1
Comparing Popular Simulation Environments in the Scope of Robotics and Reinforcement Learning | Code | 1
A Policy-Guided Imitation Approach for Offline Reinforcement Learning | Code | 1
Denoised MDPs: Learning World Models Better Than the World Itself | Code | 1
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Code | 1
Compositional Reinforcement Learning from Logical Specifications | Code | 1
CommonPower: A Framework for Safe Data-Driven Smart Grid Control | Code | 1
Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings | Code | 1
Communicative Reinforcement Learning Agents for Landmark Detection in Brain Images | Code | 1
Dialogue for Prompting: a Policy-Gradient-Based Discrete Prompt Generation for Few-shot Learning | Code | 1
Improving Planning with Large Language Models: A Modular Agentic Architecture | Code | 1
Visual Grounding for Object-Level Generalization in Reinforcement Learning | Code | 1
A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning | Code | 1
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks | Code | 1
Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Code | 1
CompoSuite: A Compositional Reinforcement Learning Benchmark | Code | 1
Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning | Code | 1
Contrastive Reinforcement Learning of Symbolic Reasoning Domains | Code | 1
Page 25 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified