SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
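The interaction loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from any paper listed here: the five-position corridor environment, the reward values, the discount factor `gamma`, and the choice of tabular Q-learning with epsilon-greedy exploration are all hypothetical choices made for the example.

```python
import random

N_STATES = 5          # assumed toy corridor: positions 0..4, position 4 is the goal
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right


def step(state, action_idx):
    """Environment: apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.1   # reward at the goal, small penalty per step
    return next_state, reward, done


def greedy(q, state):
    """Pick the action with the highest estimated value (ties go to index 0)."""
    return max(range(len(ACTIONS)), key=lambda a: q[state][a])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning: learn action values from reward feedback alone."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = greedy(q, state)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def episode_return(q, max_steps=100):
    """Cumulative (undiscounted) reward of one greedy episode from position 0."""
    state, total = 0, 0.0
    for _ in range(max_steps):
        state, reward, done = step(state, greedy(q, state))
        total += reward
        if done:
            break
    return total


q_table = train()
policy = [greedy(q_table, s) for s in range(N_STATES - 1)]
print(policy, round(episode_return(q_table), 2))
```

In this toy setting the learned greedy policy should step right from every non-goal position, which is the strategy that maximizes cumulative reward. Real benchmarks replace the corridor with far richer environments and the table with a function approximator.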

Papers

Showing 1651–1675 of 15113 papers

Title | Status | Hype
AI2-THOR: An Interactive 3D Environment for Visual AI | Code | 1
Efficient Active Search for Combinatorial Optimization Problems | Code | 1
Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning | Code | 1
Attacking Cooperative Multi-Agent Reinforcement Learning by Adversarial Minority Influence | Code | 1
GUI-G1: Understanding R1-Zero-Like Training for Visual Grounding in GUI Agents | Code | 1
Hierarchical clustering in particle physics through reinforcement learning | Code | 1
Efficient Continuous Control with Double Actors and Regularized Critics | Code | 1
On the model-based stochastic value gradient for continuous reinforcement learning | Code | 1
Rethinking the Implementation Matters in Cooperative Multi-Agent Reinforcement Learning | Code | 1
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning | Code | 1
Efficient Diffusion Policies for Offline Reinforcement Learning | Code | 1
On Uncertainty in Deep State Space Models for Model-Based Reinforcement Learning | Code | 1
An Encoder-Decoder Based Audio Captioning System With Transfer and Reinforcement Learning | Code | 1
Gradient Surgery for Multi-Task Learning | Code | 1
Graph Constrained Reinforcement Learning for Natural Language Action Spaces | Code | 1
Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning | Code | 1
Efficient Pressure: Improving efficiency for signalized intersections | Code | 1
Tactile Sim-to-Real Policy Transfer via Real-to-Sim Image Translation | Code | 1
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning | Code | 1
Bayesian Generational Population-Based Training | Code | 1
Optimal Transport for Offline Imitation Learning | Code | 1
Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate | Code | 1
Efficient Reinforcement Learning Through Trajectory Generation | Code | 1
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach | Code | 1
Graph Convolutional Memory using Topological Priors | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified