SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
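The interaction loop described above can be sketched with a small toy example. This is only an illustration under stated assumptions: the five-cell corridor environment, the function names (`step`, `train_q_learning`, `greedy_policy`), and the hyperparameters are all hypothetical, and tabular Q-learning is used here as one common way to learn a policy from rewards, not as the method of any paper listed on this page.

```python
import random

N_STATES = 5      # hypothetical corridor with cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values from reward feedback (assumed hyperparameters)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(100):  # cap the episode length
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Target is the reward plus the discounted value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train_q_learning()))
```

In this sketch the only reward is given on reaching the goal cell, so the learned greedy policy should move right in every non-terminal cell; the discount factor `gamma` is what makes shorter paths to the reward more valuable than longer ones.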

Papers

Showing 13851-13900 of 15113 papers

Title | Status | Hype
Instance Weighted Incremental Evolution Strategies for Reinforcement Learning in Dynamic Environments | Code | 0
Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning | Code | 0
A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer | Code | 0
Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks | Code | 0
DeLF: Designing Learning Environments with Foundation Models | Code | 0
Amplifying the Imitation Effect for Reinforcement Learning of UCAV's Mission Execution | Code | 0
Extending Environments To Measure Self-Reflection In Reinforcement Learning | Code | 0
Auto-Pipeline: Synthesizing Complex Data Pipelines By-Target Using Reinforcement Learning and Search | Code | 0
External Model Motivated Agents: Reinforcement Learning for Enhanced Environment Sampling | Code | 0
Let's Play Again: Variability of Deep Reinforcement Learning Agents in Atari Environments | Code | 0
Developing parsimonious ensembles using predictor diversity within a reinforcement learning framework | Code | 0
Defending Observation Attacks in Deep Reinforcement Learning via Detection and Denoising | Code | 0
Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies | Code | 0
Development of a PPO-Reinforcement Learned Walking Tripedal Soft-Legged Robot using SOFA | Code | 0
A Monte Carlo AIXI Approximation | Code | 0
Learning to Compose Neural Networks for Question Answering | Code | 0
Device Placement Optimization with Reinforcement Learning | Code | 0
Dex: Incremental Learning for Complex Environments in Deep Reinforcement Learning | Code | 0
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization | Code | 0
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations | Code | 0
Learning to Solve Voxel Building Embodied Tasks from Pixels and Natural Language Instructions | Code | 0
Conservative and Risk-Aware Offline Multi-Agent Reinforcement Learning | Code | 0
Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | Code | 0
Deep W-Networks: Solving Multi-Objective Optimisation Problems With Deep Reinforcement Learning | Code | 0
Deep Visual Foresight for Planning Robot Motion | Code | 0
Policy Iterations for Reinforcement Learning Problems in Continuous Time and Space -- Fundamental Theory and Methods | Code | 0
DHER: Hindsight Experience Replay for Dynamic Goals | Code | 0
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection | Code | 0
Diagnosing Bottlenecks in Deep Q-learning Algorithms | Code | 0
A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation | Code | 0
Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning | Code | 0
Dialog-based Interactive Image Retrieval | Code | 0
MEDIRL: Predicting the Visual Attention of Drivers via Maximum Entropy Deep Inverse Reinforcement Learning | Code | 0
Deep Variational Reinforcement Learning for POMDPs | Code | 0
Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning | Code | 0
Deep Transfer Reinforcement Learning for Text Summarization | Code | 0
Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems | Code | 0
Learning Invariances for Policy Generalization | Code | 0
Approximately Optimal Search on a Higher-dimensional Sliding Puzzle | Code | 0
DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation | Code | 0
Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation | Code | 0
Autonomous Soft Tissue Retraction Using Demonstration-Guided Reinforcement Learning | Code | 0
Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling | Code | 0
Learning to Steer Markovian Agents under Model Uncertainty | Code | 0
Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies | Code | 0
Adjust Planning Strategies to Accommodate Reinforcement Learning Agents | Code | 0
Learning to Control in Metric Space with Optimal Regret | Code | 0
Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images | Code | 0
Applying Deep Reinforcement Learning to the HP Model for Protein Structure Prediction | Code | 0
Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization | Code | 0
Page 278 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified