SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy: a decision-making strategy that maximizes the expected long-term reward.
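The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL algorithms and is not singled out by this page. The five-cell corridor environment, the `step` function, and all hyperparameter values are hypothetical choices made for illustration.

```python
import random

N_STATES = 5      # corridor cells 0..4; cell 4 is the goal (terminal)
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_learning(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values from interaction alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(100):  # cap the episode length
            # Epsilon-greedy: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                # Break ties at random so the untrained agent still moves around.
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train_q_learning())` returns one action per non-terminal cell. In this toy setting the learned policy should prefer stepping right, toward the rewarded cell, which is the reward-maximizing strategy here.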

Papers

Showing 6251 to 6275 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Limited Query Graph Connectivity Test | | 0 |
| Limits of Actor-Critic Algorithms for Decision Tree Policies Learning in IBMDPs | | 0 |
| Lineage Evolution Reinforcement Learning | | 0 |
| Linear Bellman Completeness Suffices for Efficient Online Reinforcement Learning with Few Actions | | 0 |
| Linear Complementarity for Regularized Policy Evaluation and Improvement | | 0 |
| Linear convergence of a policy gradient method for some finite horizon continuous time control problems | | 0 |
| Linear Feature Encoding for Reinforcement Learning | | 0 |
| Linear interpolation gives better gradients than Gaussian smoothing in derivative-free optimization | | 0 |
| Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods | | 0 |
| Logarithmic regret for episodic continuous-time linear-quadratic reinforcement learning over a finite-time horizon | | 0 |
| Linear Reinforcement Learning with Ball Structure Action Space | | 0 |
| Linear Representation Meta-Reinforcement Learning for Instant Adaptation | | 0 |
| Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging | | 0 |
| LISPR: An Options Framework for Policy Reuse with Reinforcement Learning | | 0 |
| Listener-Rewarded Thinking in VLMs for Image Preferences | | 0 |
| LlamaRL: A Distributed Asynchronous Reinforcement Learning Framework for Efficient Large-scale LLM Trainin | | 0 |
| LLM Alignment as Retriever Optimization: An Information Retrieval Perspective | | 0 |
| LLM Augmented Hierarchical Agents | | 0 |
| LLM-Augmented Symbolic Reinforcement Learning with Landmark-Based Task Decomposition | | 0 |
| LLM-based Multi-Agent Reinforcement Learning: Current and Future Directions | | 0 |
| LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble | | 0 |
| LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models | | 0 |
| LLM-hRIC: LLM-empowered Hierarchical RAN Intelligent Control for O-RAN | | 0 |
| LLMs for Engineering: Teaching Models to Design High Powered Rockets | | 0 |
| LLMs Meet Finance: Fine-Tuning Foundation Models for the Open FinLLM Leaderboard | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |