SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback, receiving rewards or penalties for its actions. The goal is to find the optimal policy, that is, the decision-making strategy that maximizes long-term reward.
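As a rough illustration of the agent-environment loop described above, here is a minimal sketch of tabular Q-learning on a toy five-state chain. The environment, the `step` and `discounted_return` helpers, and all hyperparameters are assumptions made up for this example; they do not come from any paper listed on this page. The agent explores with uniformly random actions, which Q-learning tolerates because it is an off-policy method.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the last state is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward signal: sum over t of gamma**t * r_t."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


def q_learning(episodes=200, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Learn action values from interaction, using random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states: pick the highest-valued action.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

In this toy setup the learned greedy policy should choose "right" in every non-terminal state, since that is the only way to reach the rewarding terminal state. Real problems typically need function approximation and a more careful exploration strategy than uniform random actions.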

Papers

Showing 1551-1600 of 15113 papers

Title | Status | Hype
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks | Code | 1
A Deep Reinforced Model for Abstractive Summarization | Code | 1
Agents that Listen: High-Throughput Reinforcement Learning with Multiple Sensory Systems | Code | 1
Learning with AMIGo: Adversarially Motivated Intrinsic Goals | Code | 1
Comparing Observation and Action Representations for Deep Reinforcement Learning in μRTS | Code | 1
Automatic Data Augmentation for Generalization in Deep Reinforcement Learning | Code | 1
Towards Real-World Deployment of Reinforcement Learning for Traffic Signal Control | Code | 1
Lenient Multi-Agent Deep Reinforcement Learning | Code | 1
Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare | Code | 1
CommonPower: A Framework for Safe Data-Driven Smart Grid Control | Code | 1
Agent with Warm Start and Active Termination for Plane Localization in 3D Ultrasound | Code | 1
Leveraging Symmetry to Accelerate Learning of Trajectory Tracking Controllers for Free-Flying Robotic Systems | Code | 1
Agent with Warm Start and Adaptive Dynamic Termination for Plane Localization in 3D Ultrasound | Code | 1
LibSignal: An Open Library for Traffic Signal Control | Code | 1
Lifelong Incremental Reinforcement Learning with Online Bayesian Inference | Code | 1
Lifelong Machine Learning of Functionally Compositional Structures | Code | 1
A2C is a special case of PPO | Code | 1
A Cooperative Multi-Agent Reinforcement Learning Framework for Resource Balancing in Complex Logistics Network | Code | 1
Communicative Reinforcement Learning Agents for Landmark Detection in Brain Images | Code | 1
LOA: Logical Optimal Actions for Text-based Interaction Games | Code | 1
Logic and the 2-Simplicial Transformer | Code | 1
Comparing Popular Simulation Environments in the Scope of Robotics and Reinforcement Learning | Code | 1
Compound AI Systems Optimization: A Survey of Methods, Challenges, and Future Directions | Code | 1
Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem | Code | 1
Low-Rank Modular Reinforcement Learning via Muscle Synergy | Code | 1
LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning | Code | 1
Lyapunov Barrier Policy Optimization | Code | 1
Lyapunov-Regularized Reinforcement Learning for Power System Transient Stability | Code | 1
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay | Code | 1
Combining Reinforcement Learning with Model Predictive Control for On-Ramp Merging | Code | 1
Combining Modular Skills in Multitask Learning | Code | 1
MADE: Exploration via Maximizing Deviation from Explored Regions | Code | 1
MAMBPO: Sample-efficient multi-robot reinforcement learning using learned world models | Code | 1
Managing power grids through topology actions: A comparative study between advanced rule-based and reinforcement learning agents | Code | 1
Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization | Code | 1
Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization | Code | 1
Market-making with reinforcement-learning (SAC) | Code | 1
MARLeME: A Multi-Agent Reinforcement Learning Model Extraction Library | Code | 1
Reinforcement Learning for Combining Search Methods in the Calibration of Economic ABMs | Code | 1
MarsExplorer: Exploration of Unknown Terrains via Deep Reinforcement Learning and Procedurally Generated Environments | Code | 1
Combinatorial Optimization with Policy Adaptation using Latent Space Search | Code | 1
An End-to-End Reinforcement Learning Approach for Job-Shop Scheduling Problems Based on Constraint Programming | Code | 1
Combining Deep Reinforcement Learning and Search for Imperfect-Information Games | Code | 1
An End-to-end Deep Reinforcement Learning Approach for the Long-term Short-term Planning on the Frenet Space | Code | 1
Maximum a Posteriori Policy Optimisation | Code | 1
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning | Code | 1
A Sustainable Ecosystem through Emergent Cooperation in Multi-Agent Reinforcement Learning | Code | 1
A SWAT-based Reinforcement Learning Framework for Crop Management | Code | 1
Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning | Code | 1
Learning to combine primitive skills: A step towards versatile robotic manipulation | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified