SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
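The agent-environment loop described above can be sketched with tabular Q-learning on a toy problem. This is a minimal illustration under stated assumptions, not a method taken from any paper listed here: the five-state corridor, the `step`, `train`, and `greedy_policy` names, and all hyperparameters are hypothetical choices made for the example. Q-learning is used only as one standard way to learn a reward-maximizing policy.

```python
import random

# Hypothetical toy environment: a corridor of 5 states (0..4).
# The agent starts in state 0; state 4 is the terminal goal.
N_STATES = 5
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # feedback arrives only at the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1,
          max_steps=200, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.randrange(len(ACTIONS))
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, action)
            # Target = immediate reward + discounted value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action index for each non-terminal state."""
    return [max(range(len(ACTIONS)), key=lambda a: q[s][a])
            for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

The discount factor `gamma` is what makes the agent value long-term reward: the single reward at the goal is propagated backward through the value estimates, so the learned greedy policy should choose "right" (index 1) in every non-terminal state.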

Papers

Showing 301-310 of 15113 papers

Title | Status | Hype
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation | Code | 1
Reinforcement Learning for Ballbot Navigation in Uneven Terrain | Code | 1
The Cell Must Go On: Agar.io for Continual Reinforcement Learning | Code | 1
Arctic-Text2SQL-R1: Simple Rewards, Strong Reasoning in Text-to-SQL | Code | 3
RAP: Runtime-Adaptive Pruning for LLM Inference |  | 0
Backdoors in DRL: Four Environments Focusing on In-distribution Triggers |  | 0
Control of Renewable Energy Communities using AI and Real-World Data |  | 0
LARES: Latent Reasoning for Sequential Recommendation |  | 0
DeepRec: Towards a Deep Dive Into the Item Space with Large Language Model Based Recommendation |  | 0
AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified
2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified