SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal of reinforcement learning is to find the optimal policy, that is, the decision-making strategy that maximizes the long-term reward.
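The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, one common RL algorithm chosen here purely for illustration, on a hypothetical five-cell corridor where the agent earns a reward of 1 for reaching the rightmost cell. The environment, the hyperparameters, and the names `step`, `train`, and `greedy_policy` are all assumptions made for this example and do not come from any paper listed on this page.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or when the estimates tie),
            # otherwise exploit the current value estimates.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Feedback from the environment: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned policy is to step right in every cell, since that reaches the reward in the fewest steps; the discount factor `gamma` is what makes shorter paths to the reward more valuable than longer ones.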

Papers

Showing 12001–12050 of 15113 papers

Title | Status | Hype
LIMIS: Locally Interpretable Modeling using Instance-wise Subsampling | | 0
Harnessing Structures for Value-Based Planning and Reinforcement Learning | Code | 0
Scaling data-driven robotics with reward sketching and batch reinforcement learning | | 0
Visual Exploration and Energy-aware Path Planning via Reinforcement Learning | Code | 0
CAQL: Continuous Action Q-Learning | | 0
MERL: Multi-Head Reinforcement Learning | | 0
Learning to Reach Goals Without Reinforcement Learning | | 0
City Metro Network Expansion with Reinforcement Learning | | 0
Generalizing Reinforcement Learning to Unseen Actions | | 0
Assessing Generalization in TD methods for Deep Reinforcement Learning | | 0
Adapt-to-Learn: Policy Transfer in Reinforcement Learning | | 0
AUGMENTED POLICY GRADIENT METHODS FOR EFFICIENT REINFORCEMENT LEARNING | | 0
Learning by shaking: Computing policy gradients by physical forward-propagation | | 0
Collaborative Inter-agent Knowledge Distillation for Reinforcement Learning | | 0
Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning | | 0
Counterfactual Regularization for Model-Based Reinforcement Learning | | 0
Learning World Graph Decompositions To Accelerate Reinforcement Learning | | 0
C-3PO: Cyclic-Three-Phase Optimization for Human-Robot Motion Retargeting based on Reinforcement Learning | Code | 0
Learning Key Steps to Attack Deep Reinforcement Learning Agents | | 0
CAPACITY-LIMITED REINFORCEMENT LEARNING: APPLICATIONS IN DEEP ACTOR-CRITIC METHODS FOR CONTINUOUS CONTROL | | 0
Improving Exploration of Deep Reinforcement Learning using Planning for Policy Search | | 0
Learning with Social Influence through Interior Policy Differentiation | | 0
Learning to Reason: Distilling Hierarchy via Self-Supervision and Reinforcement Learning | | 0
Consistent Meta-Reinforcement Learning via Model Identification and Experience Relabeling | | 0
Learning Algorithmic Solutions to Symbolic Planning Tasks with a Neural Computer | | 0
Data Valuation using Reinforcement Learning | Code | 0
How many weights are enough : can tensor factorization learn efficient policies ? | | 0
Efficient meta reinforcement learning via meta goal generation | | 0
Learning Temporal Abstraction with Information-theoretic Constraints for Hierarchical Reinforcement Learning | | 0
Do recent advancements in model-based deep reinforcement learning really improve data efficiency? | | 0
Learning Good Policies By Learning Good Perceptual Models | | 0
Long-term planning, short-term adjustments | | 0
Attention Privileged Reinforcement Learning for Domain Transfer | | 0
BANANAS: Bayesian Optimization with Neural Networks for Neural Architecture Search | | 0
Behavior-Guided Reinforcement Learning | | 0
Deep RL for Blood Glucose Control: Lessons, Challenges, and Opportunities | | 0
Advantage Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning | | 0
Hope For The Best But Prepare For The Worst: Cautious Adaptation In RL Agents | | 0
Learning Functionally Decomposed Hierarchies for Continuous Navigation Tasks | | 0
Event Discovery for History Representation in Reinforcement Learning | | 0
Avoiding Negative Side-Effects and Promoting Safe Exploration with Imaginative Planning | | 0
DeepAGREL: Biologically plausible deep learning via direct reinforcement | | 0
Contextual Inverse Reinforcement Learning | | 0
Evo-NAS: Evolutionary-Neural Hybrid Agent for Architecture Search | | 0
HIPPOCAMPAL NEURONAL REPRESENTATIONS IN CONTINUAL LEARNING | | 0
CrossNorm: On Normalization for Off-Policy Reinforcement Learning | | 0
Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning | | 0
Trajectory representation learning for Multi-Task NMRDPs planning | | 0
Risk Averse Value Expansion for Sample Efficient and Robust Policy Learning | | 0
Multi-task Batch Reinforcement Learning with Metric Learning | | 0
Page 241 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified