SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
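The interaction loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the corridor environment, its reward values, the discount factor `gamma`, and the names `step`, `run_episode`, and `discounted_return` are all hypothetical and do not come from any paper listed on this page. It shows an agent acting in an environment, collecting rewards, and being scored by the discounted cumulative reward that a learned policy would try to maximize.

```python
from typing import Callable, List, Tuple

# Hypothetical toy environment: a corridor of N_STATES cells.
# The agent starts in cell 0 and the episode ends at the rightmost cell.
N_STATES = 5
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1


def step(state: int, action: int) -> Tuple[int, float, bool]:
    """Apply an action and return (next_state, reward, done)."""
    if action == RIGHT:
        next_state = min(state + 1, GOAL)
    else:
        next_state = max(state - 1, 0)
    done = next_state == GOAL
    # Assumed reward scheme: +1.0 for reaching the goal, -0.1 per other step.
    reward = 1.0 if done else -0.1
    return next_state, reward, done


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Compute sum_t gamma**t * rewards[t] by folding from the last reward."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def run_episode(policy: Callable[[int], int],
                gamma: float = 0.9,
                max_steps: int = 20) -> float:
    """Run one episode under `policy` and return its discounted return."""
    state = 0
    rewards: List[float] = []
    for _ in range(max_steps):
        action = policy(state)               # agent chooses an action
        state, reward, done = step(state, action)  # environment responds
        rewards.append(reward)
        if done:
            break
    return discounted_return(rewards, gamma)


def always_right(state: int) -> int:
    """Fixed policy that always moves toward the goal."""
    return RIGHT


def always_left(state: int) -> int:
    """Fixed policy that never reaches the goal."""
    return LEFT


if __name__ == "__main__":
    print("always_right:", run_episode(always_right))
    print("always_left: ", run_episode(always_left))
```

In this toy setting the policy that always moves right earns a higher discounted return than the one that always moves left. A learning algorithm would have to discover that preference from the reward signal alone rather than being handed the policy.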

Papers

Showing 14701–14750 of 15113 papers

Title | Status | Hype
Cross-View Policy Learning for Street Navigation | Code | 0
Learning State Representations from Random Deep Action-conditional Predictions | Code | 0
GLIB: Efficient Exploration for Relational Model-Based Reinforcement Learning via Goal-Literal Babbling | Code | 0
A0C: Alpha Zero in Continuous Action Space | Code | 0
Learning State Representations via Retracing in Reinforcement Learning | Code | 0
Energy-Efficient Parking Analytics System using Deep Reinforcement Learning | Code | 0
Energy-Efficient Thermal Comfort Control in Smart Buildings via Deep Reinforcement Learning | Code | 0
Global and Local Analysis of Interestingness for Competency-Aware Deep Reinforcement Learning | Code | 0
Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL | Code | 0
Crossmodal Attentive Skill Learner | Code | 0
A Scavenger Hunt for Service Robots | Code | 0
Artificial Intelligence for Prosthetics - challenge solutions | Code | 0
Cross-domain Random Pre-training with Prototypes for Reinforcement Learning | Code | 0
Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes | Code | 0
Learning Dynamic Context Augmentation for Global Entity Linking | Code | 0
Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning | Code | 0
Blind Inpainting of Large-scale Masks of Thin Structures with Adversarial and Reinforcement Learning | Code | 0
CROP: Towards Distributional-Shift Robust Reinforcement Learning using Compact Reshaped Observation Processing | Code | 0
Blackout Mitigation via Physics-guided RL | Code | 0
Enforcing Almost-Sure Reachability in POMDPs | Code | 0
Black-Box Data-efficient Policy Search for Robotics | Code | 0
Bipedal Walking Robot using Deep Deterministic Policy Gradient | Code | 0
Biologically Plausible Variational Policy Gradient with Spiking Recurrent Winner-Take-All Networks | Code | 0
CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing | Code | 0
Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field | Code | 0
A Comparison of Reinforcement Learning Frameworks for Software Testing Tasks | Code | 0
BindsNET: A machine learning-oriented spiking neural networks library in Python | Code | 0
Goal-conditioned Imitation Learning | Code | 0
Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments | Code | 0
Improving Generalization on the ProcGen Benchmark with Simple Architectural Changes and Scale | Code | 0
Goal-Conditioned Q-Learning as Knowledge Distillation | Code | 0
BiERL: A Meta Evolutionary Reinforcement Learning Framework via Bilevel Optimization | Code | 0
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning | Code | 0
Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning | Code | 0
Learning Structural Weight Uncertainty for Sequential Decision-Making | Code | 0
CRC-RL: A Novel Visual Feature Representation Architecture for Unsupervised Reinforcement Learning | Code | 0
An Empirical Evaluation of Posterior Sampling for Constrained Reinforcement Learning | Code | 0
Crawling in Rogue's dungeons with (partitioned) A3C | Code | 0
Improving Image Captioning with Conditional Generative Adversarial Nets | Code | 0
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning | Code | 0
5G Routing Interfered Environment | Code | 0
Enhancing Commentary Strategies for Imperfect Information Card Games: A Study of Large Language Models in Guandan Commentary | Code | 0
Goal Exploration Augmentation via Pre-trained Skills for Sparse-Reward Long-Horizon Goal-Conditioned Reinforcement Learning | Code | 0
Crafting desirable climate trajectories with RL explored socio-environmental simulations | Code | 0
BF++: a language for general-purpose program synthesis | Code | 0
Crafting a Toolchain for Image Restoration by Deep Reinforcement Learning | Code | 0
Crafting a Pogo Stick in Minecraft with Heuristic Search (Extended Abstract) | Code | 0
Course Recommender Systems Need to Consider the Job Market | Code | 0
Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning | Code | 0
Adaptive Power System Emergency Control using Deep Reinforcement Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified