SOTAVerified

Efficient Exploration

Efficient exploration is one of the main obstacles in scaling up modern deep reinforcement learning algorithms. The main challenge is balancing the exploitation of current estimates against gathering information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
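The trade-off described above can be illustrated on a toy problem. The sketch below is not the method of the source paper; it assumes a hypothetical three-armed Bernoulli bandit and uses Beta-Bernoulli Thompson sampling, a standard technique in which each action's value is sampled from its posterior, so poorly understood actions still get tried while well-estimated good actions are exploited. The function name, arm probabilities, step count, and seed are all illustrative assumptions.

```python
import random


def thompson_sampling(true_probs, steps=2000, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a toy bandit.

    true_probs: hypothetical success probability of each arm (unknown to the agent).
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alpha = [1] * n_arms  # Beta(1, 1) prior: one pseudo-success per arm
    beta = [1] * n_arms   # ... and one pseudo-failure per arm
    pulls = [0] * n_arms

    for _ in range(steps):
        # Draw one plausible success rate per arm from its current posterior.
        # Arms with little data have wide posteriors, so they are sometimes
        # sampled high and get explored.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8])
print(pulls)
```

In a run like this, most pulls should concentrate on the arm with the highest success probability, while the other arms receive only enough pulls to rule them out. Deep reinforcement learning methods face the same tension over far larger state and action spaces.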

Papers

Showing 451–475 of 514 papers

Title | Status | Hype
Optimization by Pairwise Linkage Detection, Incremental Linkage Set, and Restricted / Back Mixing: DSMGA-II | | 0
Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration | | 0
New/s/leak 2.0 - Multilingual Information Extraction and Visualization for Investigative Journalism | | 0
Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes | Code | 0
Goal-oriented Trajectories for Efficient Exploration | | 0
Curiosity Driven Exploration of Learned Disentangled Goal Spaces | Code | 0
Efficient Gradient-Free Variational Inference using Policy Search | Code | 0
Multi-objective Model-based Policy Search for Data-efficient Learning with Sparse Rewards | Code | 0
Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents | Code | 0
Meta-Learning for Stochastic Gradient MCMC | Code | 0
Randomized Value Functions via Multiplicative Normalizing Flows | Code | 0
A Web-scale system for scientific knowledge exploration | | 0
When Simple Exploration is Sample Efficient: Identifying Sufficient Conditions for Random Exploration to Yield PAC RL Algorithms | | 0
Efficient Exploration of Gradient Space for Online Learning to Rank | | 0
Exploration by Distributional Reinforcement Learning | | 0
A Human Mixed Strategy Approach to Deep Reinforcement Learning | | 0
Variance Networks: When Expectation Does Not Meet Your Expectations | Code | 0
Dimension-Robust MCMC in Bayesian Inverse Problems | | 0
Efficient Exploration through Bayesian Deep Q-Networks | Code | 0
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning | | 0
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning | Code | 0
Federated Control with Hierarchical Multi-Agent Deep Reinforcement Learning | Code | 0
The Eigenoption-Critic Framework | | 0
Reinforced dynamics for enhanced sampling in large atomic and molecular systems | | 0
Noisy Natural Gradient as Variational Inference | Code | 0
Page 19 of 21

No leaderboard results yet.