SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
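The interaction loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, which is one standard RL algorithm chosen here for illustration and not something this page prescribes. The five-state chain environment, the `step`, `train`, and `greedy_policy` helpers, the discount factor `gamma`, and the learning rate `alpha` are all hypothetical choices made for the example.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment dynamics: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    # Reward 1.0 only when the goal is reached, 0.0 otherwise.
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore by acting at random
            next_state, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus the discounted
            # value of the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this toy setting the only reward sits at the right end of the chain, so the learned greedy policy should prefer moving right in every state. Real benchmarks use far richer environments and function approximation, but the act, observe, update cycle is the same.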

Papers

Showing 12401-12450 of 15113 papers

Title | Status | Hype
DeepMDP: Learning Continuous Latent Space Models for Representation Learning | - | 0
Deep Reinforcement Learning for Multi-objective Optimization | - | 0
Combining Reinforcement Learning and Configuration Checking for Maximum k-plex Problem | - | 0
Measurement-based Online Available Bandwidth Estimation employing Reinforcement Learning | - | 0
Reinforcement Learning When All Actions are Not Always Available | Code | 0
Probabilistic hypergraph grammars for efficient molecular optimization | - | 0
Risk-Sensitive Compact Decision Trees for Autonomous Execution in Presence of Simulated Market Response | - | 0
Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning | - | 0
Deep Q-Learning for Directed Acyclic Graph Generation | - | 0
Continuous Control for Automated Lane Change Behavior Based on Deep Deterministic Policy Gradient Algorithm | - | 0
On-board Deep Q-Network for UAV-assisted Online Power Transfer and Data Collection | - | 0
Simultaneous Translation with Flexible Policy via Restricted Imitation Learning | - | 0
Posterior Variance Analysis of Gaussian Processes with Application to Average Learning Curves | - | 0
Options as responses: Grounding behavioural hierarchies in multi-agent RL | - | 0
Reinforcement Learning with Low-Complexity Liquid State Machines | Code | 0
Autonomous Reinforcement Learning of Multiple Interrelated Tasks | - | 0
Robust exploration in linear quadratic reinforcement learning | Code | 0
Off-Policy Evaluation via Off-Policy Classification | - | 0
Reconstruct and Represent Video Contents for Captioning via Reinforcement Learning | - | 0
RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies | - | 0
Sequential Triggers for Watermarking of Deep Reinforcement Learning Policies | - | 0
Proximal Reliability Optimization for Reinforcement Learning | - | 0
Adversarial Exploitation of Policy Imitation | - | 0
Deep Reinforcement Learning Architecture for Continuous Power Allocation in High Throughput Satellites | - | 0
Learning to solve the credit assignment problem | Code | 0
Load Balancing for Ultra-Dense Networks: A Deep Reinforcement Learning Based Approach | - | 0
A Semi-Supervised Approach for Low-Resourced Text Generation | Code | 0
Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning | Code | 0
Decentralized Deep Reinforcement Learning for Delay-Power Tradeoff in Vehicular Communications | - | 0
The Principle of Unchanged Optimality in Reinforcement Learning Generalization | - | 0
Automated Video Game Testing Using Synthetic and Human-Like Agents | - | 0
Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints | - | 0
An Empirical Study on Hyperparameters and their Interdependence for RL Generalization | - | 0
On the Correctness and Sample Complexity of Inverse Reinforcement Learning | Code | 0
Air Learning: A Deep Reinforcement Learning Gym for Autonomous Aerial Robot Visual Navigation | Code | 0
Exploiting Noisy Data in Distant Supervision Relation Classification | - | 0
Decision-Making in Reinforcement Learning | - | 0
Harnessing Reinforcement Learning for Neural Motion Planning | Code | 0
Enhanced Bayesian Compression via Deep Reinforcement Learning | - | 0
Language-Driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model | - | 0
Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks | Code | 0
Interval timing in deep reinforcement learning agents | Code | 0
Attentional Policies for Cross-Context Multi-Agent Reinforcement Learning | - | 0
Reinforcement Learning Experience Reuse with Policy Residual Representation | - | 0
Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning | Code | 0
Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning | - | 0
On Value Functions and the Agent-Environment Boundary | - | 0
Reinforcement Learning for Mean Field Game | - | 0
Don't Forget Your Teacher: A Corrective Reinforcement Learning Framework | - | 0
Combating the Compounding-Error Problem with a Multi-step Model | - | 0
Page 249 of 303

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified