SOTAVerified

Efficient Exploration

Efficient Exploration is one of the main obstacles in scaling up modern deep reinforcement learning algorithms. The main challenge is balancing the exploitation of current estimates against gaining information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
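
The trade-off described above can be made concrete with a small bandit example. The sketch below uses UCB1, a classical rule that is not taken from the cited source and is only one of many ways to balance the two goals. The function name `ucb1`, the Bernoulli arms, and the arm means are illustrative assumptions.

```python
import math
import random


def ucb1(arm_means, steps, seed=0):
    """Run UCB1 on hypothetical Bernoulli arms; return (pull counts, mean estimates)."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms    # how often each arm has been pulled
    values = [0.0] * n_arms  # running mean reward per arm (the "current estimates")

    for t in range(1, steps + 1):
        if t <= n_arms:
            # Pull every arm once so each estimate is initialised.
            arm = t - 1
        else:
            # Exploit the current estimate, plus a bonus that is large for
            # rarely pulled (poorly understood) arms and shrinks with more pulls.
            arm = max(
                range(n_arms),
                key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


if __name__ == "__main__":
    counts, values = ucb1([0.1, 0.9], steps=2000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(v, 3) for v in values])
```

In this toy setting the bonus term keeps the weaker arm from being abandoned outright, while most pulls still go to the arm with the higher estimate. Deep reinforcement learning methods face the same tension over far larger state and action spaces, where simple per-action counts like these are no longer available.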

Papers

Showing 351–375 of 514 papers

Title | Status | Hype
Bandit Algorithms for Tree Search | | 0
BayesCNS: A Unified Bayesian Approach to Address Cold Start and Non-Stationarity in Search Systems at Scale | | 0
Bayesian optimisation of large-scale photonic reservoir computers | | 0
Bayesian optimization of distributed neurodynamical controller models for spatial navigation | | 0
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | | 0
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems | | 0
β-DQN: Improving Deep Q-Learning By Evolving the Behavior | | 0
Better Exploration with Optimistic Actor-Critic | | 0
Beyond Games: Bringing Exploration to Robots in Real-world | | 0
Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning | | 0
Biased Estimates of Advantages over Path Ensembles | | 0
BooVI: Provably Efficient Bootstrapped Value Iteration | | 0
Braxlines: Fast and Interactive Toolkit for RL-driven Behavior Engineering beyond Reward Maximization | | 0
CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning | | 0
Causal Information Prioritization for Efficient Reinforcement Learning | | 0
CBOL-Tuner: Classifier-pruned Bayesian optimization to explore temporally structured latent spaces for particle accelerator tuning | | 0
HelixMO: Sample-Efficient Molecular Optimization in Scene-Sensitive Latent Space | | 0
CIM: Constrained Intrinsic Motivation for Sparse-Reward Continuous Control | | 0
Clustered Reinforcement Learning | | 0
Comprehensive decision-strategy space exploration for efficient territorial planning strategies | | 0
Computational Discovery of Microstructured Composites with Optimal Stiffness-Toughness Trade-Offs | | 0
Computing low-thrust transfers in the asteroid belt, a comparison between astrodynamical manipulations and a machine learning approach | | 0
Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation Using Vision Language Models | | 0
Constrained Hybrid Metaheuristic Algorithm for Probabilistic Neural Networks Learning | | 0
Context-Dependent Upper-Confidence Bounds for Directed Exploration | | 0

No leaderboard results yet.