SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
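As a rough illustration of the agent-environment loop described above, the sketch below trains a tabular Q-learning agent on a tiny five-state chain. Everything in it is an assumption made for illustration: the toy environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameters are hypothetical, and Q-learning is just one standard algorithm, not one this page prescribes.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the goal state
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties at random
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state under the learned Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
```

In this toy setting the learned greedy policy moves right in every non-terminal state, which is the strategy that maximizes the discounted cumulative reward.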

Papers

Showing 13351–13400 of 15113 papers

Title | Status | Hype
Think Smart, Act SMARL! Analyzing Probabilistic Logic Shields for Multi-Agent Reinforcement Learning | Code | 0
Think Too Fast Nor Too Slow: The Computational Trade-off Between Planning And Reinforcement Learning | Code | 0
Mixed-Initiative Level Design with RL Brush | Code | 0
Third-Person Imitation Learning | Code | 0
Unsupervised Video Object Segmentation for Deep Reinforcement Learning | Code | 0
Solving Offline Reinforcement Learning with Decision Tree Regression | Code | 0
SafeLife 1.0: Exploring Side Effects in Complex Environments | Code | 0
SafeLight: A Reinforcement Learning Method toward Collision-free Traffic Signal Control | Code | 0
Refining Few-Step Text-to-Multiview Diffusion via Reinforcement Learning | Code | 0
XIRL: Cross-embodiment Inverse Reinforcement Learning | Code | 0
Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics | Code | 0
Safe Model-based Reinforcement Learning with Stability Guarantees | Code | 0
Reinforcement Learning of Self Enhancing Camera Image and Signal Processing | Code | 0
Recursive generalized type-2 fuzzy radial basis function neural networks for joint position estimation and adaptive EMG-based impedance control of lower limb exoskeletons | Code | 0
Unsupervised Visuomotor Control through Distributional Planning Networks | Code | 0
State Space Closure: Revisiting Endless Online Level Generation via Reinforcement Learning | Code | 0
Safe Multi-Agent Navigation guided by Goal-Conditioned Safe Reinforcement Learning | Code | 0
MiWaves Reinforcement Learning Algorithm | Code | 0
Parameter Space Noise for Exploration | Code | 0
Recurrent Sum-Product-Max Networks for Decision Making in Perfectly-Observed Environments | Code | 0
Recurrent Experience Replay in Distributed Reinforcement Learning | Code | 0
RecSim: A Configurable Simulation Platform for Recommender Systems | Code | 0
WaveCorr: Correlation-savvy Deep Reinforcement Learning for Portfolio Management | Code | 0
Revisiting Parameter Sharing in Multi-Agent Deep Reinforcement Learning | Code | 0
Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model | Code | 0
Noisy Zero-Shot Coordination: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games | Code | 0
Constrained Policy Improvement for Safe and Efficient Reinforcement Learning | Code | 0
Safe Policy Optimization with Local Generalized Linear Function Approximations | Code | 0
Modeling natural language emergence with integral transform theory and reinforcement learning | Code | 0
WiNGPT-3.0 Technical Report | Code | 0
Statistical Inference in Reinforcement Learning: A Selective Survey | Code | 0
Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings | Code | 0
Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons | Code | 0
Noisy Networks for Exploration | Code | 0
Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret | Code | 0
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research | Code | 0
Recommender systems and reinforcement learning for human-building interaction and context-aware support: A text mining-driven review of scientific literature | Code | 0
Parameterized Projected Bellman Operator | Code | 0
Multi-Agent Trust Region Policy Optimization | Code | 0
Multiagent Rollout Algorithms and Reinforcement Learning | Code | 0
Steady-State Error Compensation in Reference Tracking and Disturbance Rejection Problems for Reinforcement Learning-Based Control | Code | 0
Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning | Code | 0
Verifying Controllers Against Adversarial Examples with Bayesian Optimization | Code | 0
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies | Code | 0
Multi-Agent Reinforcement Learning Resources Allocation Method Using Dueling Double Deep Q-Network in Vehicular Networks | Code | 0
Tilted Quantile Gradient Updates for Quantile-Constrained Reinforcement Learning | Code | 0
Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach | Code | 0
Multi-Agent Reinforcement Learning for Power Grid Topology Optimization | Code | 0
Safe Reinforcement Learning From Pixels Using a Stochastic Latent Representation | Code | 0
Modeling Explicit Concerning States for Reinforcement Learning in Visual Dialogue | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified