
Efficient Exploration

Efficient Exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The central challenge is balancing exploitation of current value estimates against gathering information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
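The exploitation/exploration tradeoff described above can be illustrated with a minimal epsilon-greedy sketch. This is a generic textbook strategy, not the method of any particular paper listed below; the function name and signature are illustrative.

```python
import random

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon take a random action (explore);
    otherwise take the action with the highest value estimate (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))          # explore: uniform random action
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit: greedy action
```

With epsilon = 0 the agent always exploits its current estimates; raising epsilon spends more interactions on information gathering. Much of the work below aims to replace this undirected randomness with exploration targeted at poorly understood states.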

Papers

Showing 376–400 of 514 papers

Title | Status | Hype
Bayesian optimisation of large-scale photonic reservoir computers | — | 0
Weakly-Supervised Reinforcement Learning for Controllable Behavior | — | 0
Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning | Code | 0
Active Model Estimation in Markov Decision Processes | — | 0
Efficient Exploration in Constrained Environments with Goal-Oriented Reference Path | — | 0
Efficient exploration of zero-sum stochastic games | — | 0
Particle Filter Based Monocular Human Tracking with a 3D Cardbox Model and a Novel Deterministic Resampling Strategy | — | 0
Misspecification-robust likelihood-free inference in high dimensions | — | 0
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization | — | 0
GLIB: Efficient Exploration for Relational Model-Based Reinforcement Learning via Goal-Literal Babbling | Code | 0
Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning | Code | 0
Recruitment-imitation Mechanism for Evolutionary Reinforcement Learning | — | 0
Provably Efficient Exploration in Policy Optimization | — | 0
Explicit Planning for Efficient Exploration in Reinforcement Learning | — | 0
Better Exploration with Optimistic Actor Critic | Code | 0
Comprehensive decision-strategy space exploration for efficient territorial planning strategies | — | 0
Scaling active inference | — | 0
Bayesian Curiosity for Efficient Exploration in Reinforcement Learning | Code | 0
Implicit Generative Modeling for Efficient Exploration | — | 0
Efficient Exploration through Intrinsic Motivation Learning for Unsupervised Subgoal Discovery in Model-Free Hierarchical Reinforcement Learning | — | 0
Multi-Path Policy Optimization | — | 0
MAME: Model-Agnostic Meta-Exploration | — | 0
Neural Contextual Bandits with UCB-based Exploration | Code | 0
Structured exploration in the finite horizon linear quadratic dual control problem | — | 0
VASE: Variational Assorted Surprise Exploration for Reinforcement Learning | — | 0
Page 16 of 21

No leaderboard results yet.