SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
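As a rough illustration of the agent, environment, reward, and policy described above, here is a minimal sketch using tabular Q-learning. Q-learning is only one of many RL algorithms and is chosen here for brevity. The five-state chain environment, the hyperparameters, and the names `step`, `train`, and `greedy_policy` are all hypothetical and do not come from any paper listed on this page.

```python
import random

# Hypothetical toy environment: a chain of states 0..4 where state 4 is the goal.
N_STATES = 5
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action; reward is 1.0 on reaching the goal state, else 0.0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values with tabular Q-learning (assumed hyperparameters)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action (-1 or +1) for each state."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q]


if __name__ == "__main__":
    q_values = train()
    print(greedy_policy(q_values))
```

In this toy setting the learned greedy policy moves right from every non-goal state, which is the strategy that maximizes the discounted long-term reward.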

Papers

Showing 9376-9400 of 15113 papers

Title | Status | Hype
RADARS: Memory Efficient Reinforcement Learning Aided Differentiable Neural Architecture Search | | 0
Radiology Report Generation via Multi-objective Preference Optimization | | 0
RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning | | 0
RAIDER: Reinforcement-aided Spear Phishing Detector | | 0
Raijū: Reinforcement Learning-Guided Post-Exploitation for Automating Security Assessment of Network Systems | | 0
RAIL: A modular framework for Reinforcement-learning-based Adversarial Imitation Learning | | 0
Railway Operation Rescheduling System via Dynamic Simulation and Reinforcement Learning | | 0
Raising Student Completion Rates with Adaptive Curriculum and Contextual Bandits | | 0
Random Copolymer inverse design system orienting on Accurate discovering of Antimicrobial peptide-mimetic copolymers | | 0
Random Ensemble Reinforcement Learning for Traffic Signal Control | | 0
Randomized Policy Learning for Continuous State and Action MDPs | | 0
Random Latent Exploration for Deep Reinforcement Learning | | 0
Random Network Distillation as a Diversity Metric for Both Image and Text Generation | | 0
RangL: A Reinforcement Learning Competition Platform | | 0
Ranking Items in Large-Scale Item Search Engines with Reinforcement Learning | | 0
Ranking sentences from product description & bullets for better search | | 0
Rapid Learning of Spatial Representations for Goal-Directed Navigation Based on a Novel Model of Hippocampal Place Fields | | 0
Rapid Locomotion via Reinforcement Learning | | 0
Rapidly Personalizing Mobile Health Treatment Policies with Limited Data | | 0
RAPID-RL: A Reconfigurable Architecture with Preemptive-Exits for Efficient Deep-Reinforcement Learning | | 0
RAPID: Robust and Agile Planner Using Inverse Reinforcement Learning for Vision-Based Drone Navigation | | 0
RAP: Runtime-Adaptive Pruning for LLM Inference | | 0
RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk | | 0
RaSS: Improving Denoising Diffusion Samplers with Reinforced Active Sampling Scheduler | | 0
Rate-matching the regret lower-bound in the linear quadratic regulator with unknown dynamics | | 0
Page 376 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified