SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
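As a rough illustration of the loop described above, here is a minimal sketch of tabular Q-learning on a toy five-state chain. The page itself names no algorithm or environment, so the choice of Q-learning, the chain environment, and every name and hyperparameter below (`step`, `q_learning`, `greedy_policy`, `alpha`, `gamma`) are assumptions made for the example only. The agent explores with uniformly random actions, receives a reward of 1 only on reaching the last state, and the greedy policy read off the learned table is its estimate of the reward-maximizing strategy.

```python
import random

# Hypothetical toy environment: states 0..4 in a row, state 4 is the terminal goal.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics; reward 1.0 only when the goal is reached."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from random exploration (Q-learning is off-policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # uniformly random behavior policy
            next_state, reward, done = step(state, action)
            # No bootstrapping from the terminal state.
            bootstrap = 0.0 if done else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + bootstrap - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

Under these assumptions the learned policy should move right in every non-terminal state, because the discounted value of heading toward the goal exceeds that of stepping away from it. Real problems typically need function approximation and a more careful exploration scheme than this sketch uses.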

Papers

Showing 14101-14150 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Combination of Supervised and Reinforcement Learning For Vision-Based Autonomous Control | | 0 |
| Learning Dynamic State Abstractions for Model-Based Reinforcement Learning | | 0 |
| Learning an Embedding Space for Transferable Robot Skills | | 0 |
| Learning to Treat Sepsis with Multi-Output Gaussian Process Deep Recurrent Q-Networks | | 0 |
| Autonomous Vehicle Fleet Coordination With Deep Reinforcement Learning | | 0 |
| Avoiding Catastrophic States with Intrinsic Fear | | 0 |
| Representing Entropy : A short proof of the equivalence between soft Q-learning and policy gradients | | 0 |
| NerveNet: Learning Structured Policy with Graph Neural Networks | Code | 0 |
| Reinforcement Learning via Replica Stacking of Quantum Measurements for the Training of Quantum Boltzmann Machines | | 0 |
| Policy Gradient For Multidimensional Action Spaces: Action Sampling and Entropy Bonus | | 0 |
| Neural Task Graph Execution | | 0 |
| Universal Agent for Disentangling Environments and Tasks | | 0 |
| Model-based imitation learning from state trajectories | | 0 |
| Predicting Multiple Actions for Stochastic Continuous Control | | 0 |
| Neuron as an Agent | | 0 |
| Using Deep Reinforcement Learning to Generate Rationales for Molecules | | 0 |
| Residual Loss Prediction: Reinforcement Learning With No Incremental Feedback | Code | 0 |
| LSD-Net: Look, Step and Detect for Joint Navigation and Multi-View Recognition with Deep Reinforcement Learning | | 0 |
| Reward Estimation via State Prediction | | 0 |
| Now I Remember! Episodic Memory For Reinforcement Learning | | 0 |
| Learning Structural Weight Uncertainty for Sequential Decision-Making | Code | 0 |
| Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward | Code | 0 |
| SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation | | 0 |
| Multi-timescale memory dynamics in a reinforcement learning network with attention-gated memory | Code | 0 |
| Reinforcement Learning with Analogical Similarity to Guide Schema Induction and Attention | | 0 |
| Consensus-based Sequence Training for Video Captioning | | 0 |
| A short variational proof of equivalence between policy gradients and soft Q learning | | 0 |
| Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator | | 0 |
| Federated Control with Hierarchical Multi-Agent Deep Reinforcement Learning | Code | 0 |
| Multiagent-based Participatory Urban Simulation through Inverse Reinforcement Learning | | 0 |
| Recurrent Attentional Reinforcement Learning for Multi-label Image Recognition | | 0 |
| Revisiting the Master-Slave Architecture in Multi-Agent Deep Reinforcement Learning | | 0 |
| Pseudorehearsal in actor-critic agents with neural network function approximation | | 0 |
| Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning | | 0 |
| Two-dimensional Anti-jamming Mobile Communication Based on Reinforcement Learning | | 0 |
| On Wasserstein Reinforcement Learning and the Fokker-Planck equation | | 0 |
| On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent | | 0 |
| ES Is More Than Just a Traditional Finite-Difference Approximator | | 0 |
| Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents | Code | 0 |
| Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning | Code | 0 |
| Integral Equations and Machine Learning | | 0 |
| Towards a Deep Reinforcement Learning Approach for Tower Line Wars | | 0 |
| Occam's razor is insufficient to infer the preferences of irrational agents | | 0 |
| Hierarchical Text Generation and Planning for Strategic Dialogue | Code | 0 |
| Differentiable lower bound for expected BLEU score | Code | 0 |
| Inverse Reinforcement Learning for Marketing | | 0 |
| QLBS: Q-Learner in the Black-Scholes(-Merton) Worlds | Code | 0 |
| Multi-focus Attention Network for Efficient Deep Reinforcement Learning | | 0 |
| Simulated Autonomous Driving on Realistic Road Networks using Deep Reinforcement Learning | | 0 |
| Deep Reinforcement Learning Boosted by External Knowledge | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |