
Efficient Exploration

Efficient Exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The central challenge is balancing exploitation of current value estimates against gaining information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
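The exploration–exploitation balance described above can be illustrated with a classic bandit heuristic. The sketch below is a minimal, hypothetical example (not taken from the cited paper) using UCB1, which adds an uncertainty bonus to each action's value estimate so that rarely tried actions still get selected; the payoff probabilities are made up for the toy problem.

```python
import math
import random

def ucb1_action(counts, values, t, c=2.0):
    """Pick an arm balancing exploitation (high value estimate)
    against exploration (an uncertainty bonus for rarely tried arms)."""
    for a, n in enumerate(counts):
        if n == 0:
            return a  # try every action at least once
    scores = [values[a] + math.sqrt(c * math.log(t) / counts[a])
              for a in range(len(counts))]
    return max(range(len(counts)), key=scores.__getitem__)

# Toy 3-armed Bernoulli bandit with hypothetical payoff probabilities
probs = [0.2, 0.5, 0.8]
counts = [0, 0, 0]
values = [0.0, 0.0, 0.0]
random.seed(0)
for t in range(1, 1001):
    a = ucb1_action(counts, values, t)
    reward = 1.0 if random.random() < probs[a] else 0.0
    counts[a] += 1
    values[a] += (reward - values[a]) / counts[a]  # incremental mean

print(counts)  # the highest-payoff arm should dominate the pulls
```

Methods listed on this page (randomized value functions, structured world models, perturbed feedback, and so on) can be read as richer ways of computing such an exploration bonus, or of sampling plausible value functions, in large state spaces where per-action counts are unavailable.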

Papers

Showing 251–275 of 514 papers

Title | Status | Hype
----- | ------ | ----
The split Gibbs sampler revisited: improvements to its algorithmic structure and augmented target distribution | Code | 0
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation | | 0
A Langevin-like Sampler for Discrete Distributions | Code | 1
Scalable Exploration for Neural Online Learning to Rank with Perturbed Feedback | | 0
On Preemption and Learning in Stochastic Scheduling | Code | 0
Sample-Efficient, Exploration-Based Policy Optimisation for Routing Problems | | 0
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | Code | 1
Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration | Code | 1
Personalized Algorithmic Recourse with Preference Elicitation | Code | 0
SFP: State-free Priors for Exploration in Off-Policy Reinforcement Learning | | 0
The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure | Code | 1
Feature and Instance Joint Selection: A Reinforcement Learning Perspective | | 0
Fire Burns, Sword Cuts: Commonsense Inductive Bias for Exploration in Text-based Games | Code | 0
On Machine Learning-Driven Surrogates for Sound Transmission Loss Simulations | Code | 0
A Variational Approach to Bayesian Phylogenetic Inference | Code | 0
Efficient Exploration via First-Person Behavior Cloning Assisted Rapidly-Exploring Random Trees | | 0
TANDEM: Learning Joint Exploration and Decision Making with Tactile Sensors | | 0
Collaborative Training of Heterogeneous Reinforcement Learning Agents in Environments with Sparse Rewards: What and When to Share? | Code | 0
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation | Code | 2
Learning Causal Overhypotheses through Exploration in Children and Computational Models | | 0
A Unified Perspective on Value Backup and Exploration in Monte-Carlo Tree Search | | 0
Online Decision Transformer | Code | 2
Lagrangian Manifold Monte Carlo on Monge Patches | Code | 0
Efficient Policy Space Response Oracles | | 0
Learning to Act with Affordance-Aware Multimodal Neural SLAM | Code | 0
Page 11 of 21

No leaderboard results yet.