SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
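As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning, one common RL algorithm chosen here only as an example. The `ChainEnv` environment, its reward values, and all hyperparameters (`alpha`, `gamma`, `epsilon`) are assumptions made up for this sketch and do not come from any paper listed on this page.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1 on a line, goal at the right end."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right (clamped to the chain ends).
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        # Assumed reward scheme: +1 at the goal, a small penalty on every other step.
        reward = 1.0 if done else -0.01
        return self.state, reward, done


def train_q_learning(env, episodes=500, alpha=0.1, gamma=0.95,
                     epsilon=0.1, max_steps=100, seed=0):
    """Learn a table of action values Q[state][action] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[state][a])
            next_state, reward, done = env.step(action)
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Derive a policy by picking the highest-valued action in each state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]


q_table = train_q_learning(ChainEnv())
print(greedy_policy(q_table))
```

The learned policy is read off the table by taking the highest-valued action per state. In this toy setting it should prefer moving right in every non-terminal state, since that is the only way to reach the reward. The last entry corresponds to the terminal state, which is never updated.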

Papers

Showing 9751-9800 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Model-Free Non-Stationary RL: Near-Optimal Regret and Applications in Multi-Agent RL and Inventory Control | | 0 |
| Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective | | 0 |
| Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited | | 0 |
| Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning | Code | 1 |
| Safety Aware Reinforcement Learning (SARL) | | 0 |
| Reinforcement Learning with Random Delays | Code | 1 |
| UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning | | 0 |
| Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation | Code | 1 |
| Heterogeneous Multi-Agent Reinforcement Learning for Unknown Environment Mapping | | 0 |
| Learning Diverse Options via InfoMax Termination Critic | Code | 0 |
| Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games | Code | 1 |
| Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning | Code | 1 |
| Learning to Generalize for Sequential Decision Making | Code | 0 |
| Goal-directed Generation of Discrete Structures with Conditional Generative Models | | 0 |
| A Reinforcement Learning Approach for Rebalancing Electric Vehicle Sharing Systems | Code | 1 |
| Sentiment Analysis for Reinforcement Learning | | 0 |
| Meta-Learning of Structured Task Distributions in Humans and Machines | Code | 0 |
| Policy Learning Using Weak Supervision | Code | 0 |
| The act of remembering: a study in partially observable reinforcement learning | | 0 |
| Deep Reinforcement Learning for Collaborative Edge Computing in Vehicular Networks | | 0 |
| A Distributed Model-Free Ride-Sharing Approach for Joint Matching, Pricing, and Dispatching using Deep Reinforcement Learning | | 0 |
| Deep Reinforcement Learning for Electric Vehicle Routing Problem with Time Windows | | 0 |
| Test-Cost Sensitive Methods for Identifying Nearby Points | | 0 |
| A Sharp Analysis of Model-based Reinforcement Learning with Self-Play | | 0 |
| FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning | Code | 1 |
| Disentangling causal effects for hierarchical reinforcement learning | | 0 |
| Attractor Selection in Nonlinear Energy Harvesting Using Deep Reinforcement Learning | | 0 |
| Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban | | 0 |
| Mean-Variance Efficient Reinforcement Learning with Applications to Dynamic Financial Investment | | 0 |
| Interactive Reinforcement Learning for Feature Selection with Decision Tree in the Loop | | 0 |
| Reinforcement Learning of Sequential Price Mechanisms | | 0 |
| MADRaS : Multi Agent Driving Simulator | | 0 |
| Self-Play Reinforcement Learning for Fast Image Retargeting | Code | 1 |
| Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds | Code | 1 |
| Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning | Code | 1 |
| FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization | Code | 1 |
| Recognition Method of Important Words in Korean Text based on Reinforcement Learning | | 0 |
| Multi-Reward based Reinforcement Learning for Neural Machine Translation | | 0 |
| Deep Reinforcement Learning with Mixed Convolutional Network | | 0 |
| Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs | | 0 |
| Emergent Social Learning via Multi-agent Reinforcement Learning | | 0 |
| Bayesian Meta-reinforcement Learning for Traffic Signal Control | | 0 |
| Student-Initiated Action Advising via Advice Novelty | Code | 0 |
| Bridging the gap between Markowitz planning and deep reinforcement learning | | 0 |
| AAMDRL: Augmented Asset Management with Deep Reinforcement Learning | | 0 |
| Deep Reinforcement Learning for Efficient Measurement of Quantum Devices | Code | 0 |
| Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning | | 0 |
| Accelerating Optimization and Reinforcement Learning with Quasi-Stochastic Approximation | | 0 |
| Learning to swim in potential flow | Code | 1 |
| Learning Rewards from Linguistic Feedback | Code | 1 |
Page 196 of 303

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |