SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
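As an illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning on a toy task. Q-learning is only one of many RL algorithms and is not prescribed by this page. The five-state line environment and the names `step`, `train`, and `greedy_policy` are hypothetical and exist only for this example.

```python
import random

# Hypothetical toy task: states 0..4 on a line, state 4 is terminal.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (0, 1)  # 0 = step left, 1 = step right


def step(state, action):
    """Deterministic toy environment: reward 1.0 only when GOAL is reached."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.choice(ACTIONS)  # explore at random (off-policy)
            next_state, reward = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best
            # next value; the terminal row of q is never updated and stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

In this sketch the reward signal is the only feedback the agent receives. The learned greedy policy should move right in every non-terminal state, because that maximizes the discounted long-term reward.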

Papers

Showing 9051–9100 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| A Survey on Reinforcement Learning-Aided Caching in Mobile Edge Networks | | 0 |
| Adversarial Reinforcement Learning in Dynamic Channel Access and Power Control | | 0 |
| Acting upon Imagination: when to trust imagined trajectories in model based reinforcement learning | | 0 |
| Composable Energy Policies for Reactive Motion Generation and Reinforcement Learning | | 0 |
| Hierarchical RNNs-Based Transformers MADDPG for Mixed Cooperative-Competitive Environments | | 0 |
| Zero-Shot Reinforcement Learning on Graphs for Autonomous Exploration Under Uncertainty | | 0 |
| Return-based Scaling: Yet Another Normalisation Trick for Deep RL | | 0 |
| Reinforcement learning of rare diffusive dynamics | | 0 |
| Parameter-free Gradient Temporal Difference Learning | | 0 |
| PEARL: Parallelized Expert-Assisted Reinforcement Learning for Scene Rearrangement Planning | | 0 |
| Efficient Self-Supervised Data Collection for Offline Robot Learning | | 0 |
| Age of Information Aware VNF Scheduling in Industrial IoT Using Deep Reinforcement Learning | | 0 |
| A Deep Reinforcement Learning Approach to Audio-Based Navigation in a Multi-Speaker Environment | Code | 0 |
| Dynamic Multichannel Access via Multi-agent Reinforcement Learning: Throughput and Fairness Guarantees | | 0 |
| Adaptive Policy Transfer in Reinforcement Learning | | 0 |
| Improving Cost Learning for JPEG Steganography by Exploiting JPEG Domain Knowledge | | 0 |
| Reinforcement Learning with Expert Trajectory For Quantitative Trading | | 0 |
| A parallel-network continuous quantitative trading model with GARCH and PPO | | 0 |
| Scalable, Decentralized Multi-Agent Reinforcement Learning Methods Inspired by Stigmergy and Ant Colonies | | 0 |
| RAIL: A modular framework for Reinforcement-learning-based Adversarial Imitation Learning | | 0 |
| Utilizing Skipped Frames in Action Repeats via Pseudo-Actions | | 0 |
| Using reinforcement learning to design an AI assistant for a satisfying co-op experience | | 0 |
| Reward prediction for representation learning and reward shaping | | 0 |
| Time-Aware Q-Networks: Resolving Temporal Irregularity for Deep Reinforcement Learning | | 0 |
| A Reinforcement Learning-based Economic Model Predictive Control Framework for Autonomous Operation of Chemical Reactors | | 0 |
| Deep Graph Convolutional Reinforcement Learning for Financial Portfolio Management -- DeepPocket | | 0 |
| Learning Algorithms for Regenerative Stopping Problems with Applications to Shipping Consolidation in Logistics | | 0 |
| Solving Sokoban with forward-backward reinforcement learning | | 0 |
| Safety Enhancement for Deep Reinforcement Learning in Autonomous Separation Assurance | | 0 |
| Survey on Multi-Agent Q-Learning frameworks for resource management in wireless sensor network | | 0 |
| UVIP: Model-Free Approach to Evaluate Reinforcement Learning Algorithms | Code | 0 |
| Reinforcement Learning for Scalable Logic Optimization with Graph Neural Networks | | 0 |
| On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning | | 0 |
| On the Linear convergence of Natural Policy Gradient Algorithm | | 0 |
| Data-Efficient Reinforcement Learning for Malaria Control | | 0 |
| Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference | | 0 |
| Hierarchical Reinforcement Learning for Air-to-Air Combat | | 0 |
| Learning swimming escape patterns for larval fish under energy constraints | | 0 |
| Robotic Surgery With Lean Reinforcement Learning | Code | 0 |
| Reinforcement Learning for Ridesharing: An Extended Survey | | 0 |
| Reducing Bus Bunching with Asynchronous Multi-Agent Reinforcement Learning | | 0 |
| Curious Exploration and Return-based Memory Restoration for Deep Reinforcement Learning | Code | 0 |
| BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning | | 0 |
| InferNet for Delayed Reinforcement Tasks: Addressing the Temporal Credit Assignment Problem | | 0 |
| CARL-DTN: Context Adaptive Reinforcement Learning based Routing Algorithm in Delay Tolerant Network | | 0 |
| Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling | | 0 |
| Nearest-Neighbor-based Collision Avoidance for Quadrotors via Reinforcement Learning | | 0 |
| Discrete-Time Mean Field Control with Environment States | | 0 |
| Mitigating Political Bias in Language Models Through Reinforced Calibration | | 0 |
| Mean Field MARL Based Bandwidth Negotiation Method for Massive Devices Spectrum Sharing | | 0 |
Page 182 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |