SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback: each action yields a reward or a penalty. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
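The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, chosen here only as one well-known RL algorithm, on a hypothetical five-state chain environment. The environment, the function names (`step`, `train`, `greedy_policy`), and the hyperparameter values are all assumptions made for illustration and do not come from any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (assumed): reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The discount factor `gamma` is what makes the reward "long-term": the agent is paid only at the end of the chain, and the update rule propagates that value back to earlier states so that moving right becomes preferable everywhere.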

Papers

Showing 1276-1300 of 15113 papers

Title | Status | Hype
DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems | Code | 1
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient | Code | 1
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning | Code | 1
Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions | Code | 1
ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning | Code | 1
Constraint-Guided Reinforcement Learning: Augmenting the Agent-Environment-Interaction | Code | 1
Dropout Q-Functions for Doubly Efficient Reinforcement Learning | Code | 1
Active Exploration for Inverse Reinforcement Learning | Code | 1
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training | Code | 1
DxFormer: A Decoupled Automatic Diagnostic System Based on Decoder-Encoder Transformer with Dense Symptom Representations | Code | 1
Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation | Code | 1
Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings | Code | 1
Reinforcement Learning for Combining Search Methods in the Calibration of Economic ABMs | Code | 1
CommonPower: A Framework for Safe Data-Driven Smart Grid Control | Code | 1
A Production Scheduling Framework for Reinforcement Learning Under Real-World Constraints | Code | 1
Adversarial Deep Reinforcement Learning for Improving the Robustness of Multi-agent Autonomous Driving Policies | Code | 1
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining | Code | 1
Adversarial Deep Reinforcement Learning in Portfolio Management | Code | 1
Edge Rewiring Goes Neural: Boosting Network Resilience without Rich Features | Code | 1
An Experimental Design Perspective on Model-Based Reinforcement Learning | Code | 1
A reinforcement learning path planning approach for range-only underwater target localization with autonomous vehicles | Code | 1
Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning | Code | 1
Efficient Continuous Control with Double Actors and Regularized Critics | Code | 1
A Crash Course on Reinforcement Learning | Code | 1
Combining Reinforcement Learning with Model Predictive Control for On-Ramp Merging | Code | 1
Page 52 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified