SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
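
The loop described above (act, receive a reward, update the strategy) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ChainEnv` is a hypothetical five-state toy environment invented for this example, tabular Q-learning is just one of many RL algorithms, and the hyperparameters (`alpha`, `gamma`, `epsilon`) are arbitrary choices rather than recommended values.

```python
import random


def discounted_return(rewards, gamma=0.99):
    """Cumulative reward of one episode, discounting later rewards by gamma per step."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


class ChainEnv:
    """Hypothetical toy environment: states 0..n-1 on a line, +1 for reaching the right end."""

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train_q_learning(env, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            # Explore with probability epsilon, or when both actions look equally good.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Bootstrapped target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the higher-valued action in each state (ties go to 'right')."""
    return [0 if left > right else 1 for left, right in q]


if __name__ == "__main__":
    q_table = train_q_learning(ChainEnv())
    print("greedy policy per state:", greedy_policy(q_table))
    print("return of rewards [0, 0, 1]:", discounted_return([0.0, 0.0, 1.0], gamma=0.9))
```

Here the policy is simply the greedy action under the learned value table, and "long-term reward" is made concrete by `discounted_return`. Deep RL methods such as those in the benchmark table below replace the table with a neural network, but the interaction loop keeps the same shape.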

Papers

Showing 3951–3975 of 15113 papers

Title | Status | Hype
ASQ-IT: Interactive Explanations for Reinforcement-Learning Agents | | 0
Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning | | 0
Constrained Reinforcement Learning for Dexterous Manipulation | Code | 0
Explainable Deep Reinforcement Learning: State of the Art and Challenges | | 0
Forecaster-aided User Association and Load Balancing in Multi-band Mobile Networks | | 0
Model Based Reinforcement Learning with Non-Gaussian Environment Dynamics and its Application to Portfolio Optimization | | 0
Learning to View: Decision Transformers for Active Object Detection | | 0
The configurable tree graph (CT-graph): measurable problems in partially observable and distal reward environments for lifelong reinforcement learning | Code | 0
Quasi-optimal Reinforcement Learning with Continuous Actions | | 0
Reinforcement learning-based estimation for partial differential equations | | 0
Multi-agent Reinforcement Learning with Graph Q-Networks for Antenna Tuning | | 0
Generative Slate Recommendation with Reinforcement Learning | | 0
Multi-Armed Bandits and Quantum Channel Oracles | | 0
Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning | Code | 0
Asynchronous Deep Double Duelling Q-Learning for Trading-Signal Execution in Limit Order Book Markets | | 0
Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning | | 0
Generalization through Diversity: Improving Unsupervised Environment Design | | 0
Domain-adapted Learning and Imitation: DRL for Power Arbitrage | | 0
Domain-adapted Learning and Interpretability: DRL for Gas Trading | | 0
A Survey of Meta-Reinforcement Learning | | 0
Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient | | 0
Advanced Scaling Methods for VNF deployment with Reinforcement Learning | | 0
Human-Timescale Adaptation in an Open-Ended Task Space | | 0
Multi-compartment Neuron and Population Encoding Powered Spiking Neural Network for Deep Distributional Reinforcement Learning | | 0
PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified