SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
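As a rough illustration of the "cumulative reward" the description refers to, the sketch below sums the rewards from one episode. The discount factor `gamma` and the function name `discounted_return` are assumptions introduced here for illustration; the description above does not specify a discounting scheme, and `gamma = 1.0` reduces to a plain sum.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 1.0) -> float:
    """Sum an episode's rewards, weighting the reward at step t by gamma**t.

    `gamma` is an assumed discount factor (hypothetical parameter, not from
    the source text); gamma = 1.0 gives the undiscounted cumulative reward.
    """
    total = 0.0
    # Walk backwards so each step needs only one multiply and one add:
    # G_t = r_t + gamma * G_{t+1}
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total
```

For example, with rewards `[1.0, 0.0, 2.0]` and `gamma = 0.5`, the weights are 1, 0.5, and 0.25, so the return is 1.0 + 0.0 + 0.5 = 1.5. A policy that maximizes long-term reward is one that makes this quantity as large as possible in expectation.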

Papers

Showing 14101-14150 of 15113 papers

Title | Status | Hype
Accept Synthetic Objects as Real: End-to-End Training of Attentive Deep Visuomotor Policies for Manipulation in Clutter | Code | 0
Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet | Code | 0
Distributed Transmission Control for Wireless Networks using Multi-Agent Reinforcement Learning | Code | 0
Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations | Code | 0
Automated Discovery of Local Rules for Desired Collective-Level Behavior Through Reinforcement Learning | Code | 0
How to Build User Simulators to Train RL-based Dialog Systems | Code | 0
Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics | Code | 0
Distributional constrained reinforcement learning for supply chain optimization | Code | 0
Towards Abstractive Timeline Summarisation using Preference-based Reinforcement Learning | Code | 0
AIXIjs: A Software Demo for General Reinforcement Learning | Code | 0
A Novel Approach to Curiosity and Explainable Reinforcement Learning via Interpretable Sub-Goals | Code | 0
A Deep Multi-Agent Reinforcement Learning Approach to Autonomous Separation Assurance | Code | 0
Financial Trading as a Game: A Deep Reinforcement Learning Approach | Code | 0
How to Make Deep RL Work in Practice | Code | 0
Distributionally Robust Off-Dynamics Reinforcement Learning: Provable Efficiency with Linear Function Approximation | Code | 0
Leveraging class abstraction for commonsense reinforcement learning via residual policy gradient methods | Code | 0
How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies? | Code | 0
Competing for pixels: a self-play algorithm for weakly-supervised segmentation | Code | 0
How to Sense the World: Leveraging Hierarchy in Multimodal Perception for Robust Reinforcement Learning Agents | Code | 0
Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem | Code | 0
Interactive Query-Assisted Summarization via Deep Reinforcement Learning | Code | 0
Distributional Reinforcement Learning for Energy-Based Sequential Models | Code | 0
Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning | Code | 0
Learning Multi-Objective Curricula for Robotic Policy Learning | Code | 0
Distributional Reinforcement Learning for Multi-Dimensional Reward Functions | Code | 0
A Deeper Look at Experience Replay | Code | 0
Distributional Reinforcement Learning with Regularized Wasserstein Loss | Code | 0
Learning Multiresolution Matrix Factorization and its Wavelet Networks on Graphs | Code | 0
Distributional Reinforcement Learning with Quantile Regression | Code | 0
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards | Code | 0
Deep Reinforcement Learning for Multi-Domain Dialogue Systems | Code | 0
Deep Reinforcement Learning for Multi-class Imbalanced Training | Code | 0
AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers | Code | 0
HRL4IN: Hierarchical Reinforcement Learning for Interactive Navigation with Mobile Manipulators | Code | 0
Comparison of Model-Free and Model-Based Learning-Informed Planning for PointGoal Navigation | Code | 0
Learning Natural Language Generation with Truncated Reinforcement Learning | Code | 0
Distributional Reward Estimation for Effective Multi-Agent Deep Reinforcement Learning | Code | 0
HTMRL: Biologically Plausible Reinforcement Learning with Hierarchical Temporal Memory | Code | 0
Automated Curriculum Learning by Rewarding Temporally Rare Events | Code | 0
Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning | Code | 0
A Critical Investigation of Deep Reinforcement Learning for Navigation | Code | 0
Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem | Code | 0
DistSPECTRL: Distributing Specifications in Multi-Agent Reinforcement Learning Systems | Code | 0
Latent Guided Sampling for Combinatorial Optimization | Code | 0
Deep Reinforcement Learning for Mention-Ranking Coreference Models | Code | 0
Combining Automated Optimisation of Hyperparameters and Reward Shape | Code | 0
Automata Learning meets Shielding | Code | 0
Combined Reinforcement Learning via Abstract Representations | Code | 0
Latent Intention Dialogue Models | Code | 0
Learning Diverse Options via InfoMax Termination Critic | Code | 0
Page 283 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified