SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
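
The interaction loop described above can be illustrated with a small sketch. It uses tabular Q-learning, which is only one of many RL algorithms and is chosen here for brevity. The five-state chain environment, the hyperparameters, and the names `step`, `train`, and `greedy_policy` are all hypothetical and are not taken from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a chain of states 0..4.
# Reaching state 4 ends the episode and pays a reward of 1.
N_STATES = 5
ACTIONS = (-1, +1)   # move left or move right
GAMMA = 0.9          # discount factor (assumed value)
ALPHA = 0.5          # learning rate (assumed value)
EPSILON = 0.2        # exploration probability (assumed value)


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def choose_action(q_row, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q_row)
    return rng.choice([i for i, v in enumerate(q_row) if v == best])


def train(episodes=500, seed=0):
    """Run tabular Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = choose_action(q[state], rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus the discounted
            # value of the best action in the next state.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal state under the Q-table."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])]
            for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy moves right in every non-terminal state, which is the strategy that maximizes the discounted long-term reward. Real benchmarks replace the chain with a much richer environment and the table with a function approximator.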

Papers

Showing 1251-1300 of 15113 papers

Title | Status | Hype
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design | Code | 1
Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning | Code | 1
Discovering Reinforcement Learning Algorithms | Code | 1
Discrete Codebook World Models for Continuous Control | Code | 1
Compositional Reinforcement Learning from Logical Specifications | Code | 1
DISK: Learning local features with policy gradient | Code | 1
Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation | Code | 1
Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning | Code | 1
Compile Scene Graphs with Reinforcement Learning | Code | 1
Distributed Control of Partial Differential Equations Using Convolutional Reinforcement Learning | Code | 1
CompoSuite: A Compositional Reinforcement Learning Benchmark | Code | 1
Distributed Online Service Coordination Using Deep Reinforcement Learning | Code | 1
Accelerating Deep Reinforcement Learning for Digital Twin Network Optimization with Evolutionary Strategies | Code | 1
A Reinforcement Learning Environment For Job-Shop Scheduling | Code | 1
DittoGym: Learning to Control Soft Shape-Shifting Robots | Code | 1
Diverse Policy Optimization for Structured Action Space | Code | 1
DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors | Code | 1
DMR: Decomposed Multi-Modality Representations for Frames and Events Fusion in Visual Reinforcement Learning | Code | 1
Competitiveness of MAP-Elites against Proximal Policy Optimization on locomotion tasks in deterministic simulations | Code | 1
Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling | Code | 1
Visual Grounding for Object-Level Generalization in Reinforcement Learning | Code | 1
Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning | Code | 1
Don't Touch What Matters: Task-Aware Lipschitz Data Augmentation for Visual Reinforcement Learning | Code | 1
Compiler Optimization for Quantum Computing Using Reinforcement Learning | Code | 1
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | Code | 1
DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems | Code | 1
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient | Code | 1
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning | Code | 1
Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions | Code | 1
ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning | Code | 1
Constraint-Guided Reinforcement Learning: Augmenting the Agent-Environment-Interaction | Code | 1
Dropout Q-Functions for Doubly Efficient Reinforcement Learning | Code | 1
Active Exploration for Inverse Reinforcement Learning | Code | 1
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training | Code | 1
DxFormer: A Decoupled Automatic Diagnostic System Based on Decoder-Encoder Transformer with Dense Symptom Representations | Code | 1
Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation | Code | 1
Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings | Code | 1
Reinforcement Learning for Combining Search Methods in the Calibration of Economic ABMs | Code | 1
CommonPower: A Framework for Safe Data-Driven Smart Grid Control | Code | 1
A Production Scheduling Framework for Reinforcement Learning Under Real-World Constraints | Code | 1
Adversarial Deep Reinforcement Learning for Improving the Robustness of Multi-agent Autonomous Driving Policies | Code | 1
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining | Code | 1
Adversarial Deep Reinforcement Learning in Portfolio Management | Code | 1
Edge Rewiring Goes Neural: Boosting Network Resilience without Rich Features | Code | 1
An Experimental Design Perspective on Model-Based Reinforcement Learning | Code | 1
A reinforcement learning path planning approach for range-only underwater target localization with autonomous vehicles | Code | 1
Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning | Code | 1
Efficient Continuous Control with Double Actors and Regularized Critics | Code | 1
A Crash Course on Reinforcement Learning | Code | 1
Combining Reinforcement Learning with Model Predictive Control for On-Ramp Merging | Code | 1
Page 26 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified