SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
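
To make the interaction loop concrete, here is a minimal sketch of tabular Q-learning, one common RL algorithm chosen purely for illustration; this page does not prescribe it. The five-state chain environment, the function names `step`, `choose_action`, and `train`, and all hyperparameter values are assumptions made for the example.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal (hypothetical toy setup)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical deterministic chain: reward 1.0 only on reaching the goal."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so the untrained agent does not get stuck going left.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values Q[state][action] from interaction with the chain."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so every episode ends
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal states 0..3.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

In this toy setting the learned greedy policy should move right in every non-terminal state, since that is the only way to collect the reward. The discount factor `gamma` is what turns the single delayed reward into the long-term objective described above.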

Papers

Showing 576–600 of 15113 papers

Title | Status | Hype
Human-centric Reward Optimization for Reinforcement Learning-based Automated Driving using Large Language Models | Code | 1
Simulating the Economic Impact of Rationality through Reinforcement Learning and Agent-Based Modelling | Code | 1
No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO | Code | 1
Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning | Code | 1
A fast balance optimization approach for charging enhancement of lithium-ion battery packs through deep reinforcement learning | Code | 1
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data | Code | 1
WROOM: An Autonomous Driving Approach for Off-Road Navigation | Code | 1
Dataset Reset Policy Optimization for RLHF | Code | 1
How Consistent are Clinicians? Evaluating the Predictability of Sepsis Disease Progression with Dynamics Models | Code | 1
Electric Vehicle Routing Problem for Emergency Power Supply: Towards Telecom Base Station Relief | Code | 1
Entity-Centric Reinforcement Learning for Object Manipulation from Pixels | Code | 1
The New Agronomists: Language Models are Experts in Crop Management | Code | 1
TractOracle: towards an anatomically-informed reward function for RL-based tractography | Code | 1
Reinforcement Learning-based Receding Horizon Control using Adaptive Control Barrier Functions for Safety-Critical Systems | Code | 1
PeersimGym: An Environment for Solving the Task Offloading Problem with Reinforcement Learning | Code | 1
Policy Bifurcation in Safe Reinforcement Learning | Code | 1
HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning | Code | 1
Reinforcement Learning with Token-level Feedback for Controllable Text Generation | Code | 1
Diffusion-Reinforcement Learning Hierarchical Motion Planning in Multi-agent Adversarial Games | Code | 1
Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems | Code | 1
SplAgger: Split Aggregation for Meta-Reinforcement Learning | Code | 1
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning | Code | 1
Large Language Models are Learnable Planners for Long-Term Recommendation | Code | 1
Flexible Robust Beamforming for Multibeam Satellite Downlink using Reinforcement Learning | Code | 1
How Can LLM Guide RL? A Value-Based Approach | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified