SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find an optimal policy, that is, a decision-making strategy that maximizes the long-term reward.
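The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is only one of many RL methods and is not taken from any paper listed on this page. The five-state line environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameters (`alpha`, `gamma`, `episodes`) are all hypothetical choices made for illustration.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal (terminal)
GOAL = N_STATES - 1
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward)."""
    # Moving left from state 0 leaves the agent in state 0.
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.choice(ACTIONS)  # explore at random (Q-learning is off-policy)
            next_state, reward = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            # The goal row of q is never updated, so the terminal value stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the learned greedy policy moves right in every state, since that is the shortest route to the only reward. The discount factor `gamma` is what makes "long-term reward" concrete: states farther from the goal receive geometrically smaller values.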

Papers

Showing 1401-1425 of 15113 papers

Title | Status | Hype
Curriculum-based Reinforcement Learning for Distribution System Critical Load Restoration | Code | 1
Curriculum Offline Imitation Learning | Code | 1
Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards | Code | 1
Extreme Q-Learning: MaxEnt RL without Entropy | Code | 1
Learning by Doing: Controlling a Dynamical System using Causality, Control, and Reinforcement Learning | Code | 1
D2RL: Deep Dense Architectures in Reinforcement Learning | Code | 1
Decentralized Structural-RNN for Robot Crowd Navigation with Deep Reinforcement Learning | Code | 1
Fast Adaptive Task Offloading in Edge Computing based on Meta Reinforcement Learning | Code | 1
Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing UAVs: Field Experiments | Code | 1
Learning, Fast and Slow: A Goal-Directed Memory-Based Approach for Dynamic Environments | Code | 1
Fast Population-Based Reinforcement Learning on a Single Machine | Code | 1
DataLight: Offline Data-Driven Traffic Signal Control | Code | 1
BabyAI 1.1 | Code | 1
A Workflow for Offline Model-Free Robotic Reinforcement Learning | Code | 1
Accelerating Quadratic Optimization with Reinforcement Learning | Code | 1
Expression might be enough: representing pressure and demand for reinforcement learning based traffic signal control | Code | 1
Learning Guidance Rewards with Trajectory-space Smoothing | Code | 1
Zero-Shot Compositional Policy Learning via Language Grounding | Code | 1
Actor-Attention-Critic for Multi-Agent Reinforcement Learning | Code | 1
Debiased Contrastive Learning | Code | 1
Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function | Code | 1
Learning Interpretable, High-Performing Policies for Autonomous Driving | Code | 1
Decision Transformer: Reinforcement Learning via Sequence Modeling | Code | 1
Deceptive Path Planning via Reinforcement Learning with Graph Neural Networks | Code | 1
Batch Exploration with Examples for Scalable Robotic Reinforcement Learning | Code | 1
Page 57 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified