SOTAVerified

Efficient Exploration

Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The main challenge is balancing exploitation of current estimates against gathering information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
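The trade-off described above can be illustrated on a small multi-armed bandit. The following is a minimal sketch using the standard UCB1 rule, which adds an uncertainty bonus to each value estimate so that rarely tried actions keep getting sampled. It is not the method of the cited source; the function name `ucb1`, the Bernoulli reward model, and the parameters `steps`, `c`, and `seed` are assumptions chosen for illustration.

```python
import math
import random


def ucb1(true_means, steps=2000, c=2.0, seed=0):
    """Run UCB1 on a Bernoulli bandit (hypothetical toy setup).

    Returns the pull count and the running mean reward for each arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms       # how often each arm was pulled
    estimates = [0.0] * n_arms  # current estimate of each arm's mean reward

    for t in range(1, steps + 1):
        if t <= n_arms:
            # Pull every arm once so each estimate is defined.
            arm = t - 1
        else:
            # Exploit: estimates[a]. Explore: bonus shrinks as counts[a] grows.
            scores = [
                estimates[a] + math.sqrt(c * math.log(t) / counts[a])
                for a in range(n_arms)
            ]
            arm = max(range(n_arms), key=scores.__getitem__)

        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


if __name__ == "__main__":
    counts, estimates = ucb1([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("value estimates:", [round(e, 3) for e in estimates])
```

With `c=2.0` the bonus matches the usual UCB1 form. Lowering `c` makes the agent rely more on its current estimates, while raising it spends more pulls on poorly understood arms.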

Papers

Showing 226-250 of 514 papers

Title | Status | Hype
Towards A Unified Agent with Foundation Models | | 0
LISSNAS: Locality-based Iterative Search Space Shrinkage for Neural Architecture Search | | 0
Approximate information for efficient exploration-exploitation strategies | | 0
Maximum State Entropy Exploration using Predecessor and Successor Representations | | 0
DISCO-10M: A Large-Scale Music Dataset | | 0
Inferring Hierarchical Structure in Multi-Room Maze Environments | | 0
Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP | | 0
A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning | | 0
PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm | | 0
Magnitude Attention-based Dynamic Pruning | | 0
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization | Code | 0
Large-Batch, Iteration-Efficient Neural Bayesian Design Optimization | Code | 0
Discovering Failure Modes of Text-guided Diffusion Models via Adversarial Search | | 0
EXPODE: EXploiting POlicy Discrepancy for Efficient Exploration in Multi-agent Reinforcement Learning | Code | 0
Successor-Predecessor Intrinsic Exploration | | 0
Solving Diffusion ODEs with Optimal Boundary Conditions for Better Image Super-Resolution | | 0
Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models | | 0
Joint Falsification and Fidelity Settings Optimization for Validation of Safety-Critical Systems: A Theoretical Analysis | | 0
Rescue Conversations from Dead-ends: Efficient Exploration for Task-oriented Dialogue Policy Optimization | | 0
Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning | Code | 0
Fast exploration and learning of latent graphs with aliased observations | | 0
Exploration of the search space of Gaussian graphical models for paired data | | 0
Policy Mirror Descent Inherently Explores Action Space | | 0
Exploration via Epistemic Value Estimation | | 0
Guarded Policy Optimization with Imperfect Online Demonstrations | | 0
Page 10 of 21

No leaderboard results yet.