SOTA Verified

Efficient Exploration

Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The core challenge is balancing exploitation of current value estimates against gathering information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
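The exploit-versus-explore tension described above is easiest to see on a multi-armed bandit. Below is a minimal Thompson-sampling sketch, illustrating the posterior-sampling idea behind several papers listed on this page (e.g. randomized value functions and posterior-sampling RL). The arm probabilities, step count, and function name are illustrative assumptions, not taken from any of the listed papers.

```python
import random

def thompson_sampling(true_probs, steps=2000, seed=0):
    """Thompson sampling on a Bernoulli bandit (illustrative sketch).

    Each arm keeps a Beta posterior over its success rate. At every
    step we sample one plausible success rate per arm from its
    posterior and act greedily with respect to the samples: arms with
    high estimated value are exploited, while uncertain arms still get
    sampled high occasionally, which drives exploration.
    """
    rng = random.Random(seed)
    n = len(true_probs)
    successes = [1] * n  # Beta(1, 1) uniform prior for every arm
    failures = [1] * n
    pulls = [0] * n
    for _ in range(steps):
        # One posterior sample per arm; uncertainty widens the range.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(n)]
        arm = max(range(n), key=samples.__getitem__)
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Made-up arm probabilities; over time pulls concentrate on the best arm.
pulls = thompson_sampling([0.2, 0.5, 0.8])
```

As the posteriors sharpen, the sampled values for clearly inferior arms almost never exceed the best arm's, so exploration tapers off automatically rather than via a hand-tuned schedule.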

Papers

Showing 151–200 of 514 papers

Title | Status | Hype
Collaborative Training of Heterogeneous Reinforcement Learning Agents in Environments with Sparse Rewards: What and When to Share? | Code | 0
IN-RIL: Interleaved Reinforcement and Imitation Learning for Policy Fine-Tuning | Code | 0
Concurrent Meta Reinforcement Learning | Code | 0
Efficient Exploration through Bayesian Deep Q-Networks | Code | 0
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling | Code | 0
A Fast and Scalable Polyatomic Frank-Wolfe Algorithm for the LASSO | Code | 0
ConEx: Efficient Exploration of Big-Data System Configurations for Better Performance | Code | 0
Model-based Reinforcement Learning for Continuous Control with Posterior Sampling | Code | 0
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning | Code | 0
Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization | Code | 0
Go Beyond Imagination: Maximizing Episodic Reachability with World Models | Code | 0
Hierarchically Organized Latent Modules for Exploratory Search in Morphogenetic Systems | Code | 0
GLIB: Efficient Exploration for Relational Model-Based Reinforcement Learning via Goal-Literal Babbling | Code | 0
A Variational Approach to Bayesian Phylogenetic Inference | Code | 0
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits | Code | 0
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables | Code | 0
GenPlan: Generative Sequence Models as Adaptive Planners | Code | 0
Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance | Code | 0
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Code | 0
Meta-Learning Integration in Hierarchical Reinforcement Learning for Advanced Task Complexity | Code | 0
Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs | Code | 0
Multi-objective Model-based Policy Search for Data-efficient Learning with Sparse Rewards | Code | 0
A New Bandit Setting Balancing Information from State Evolution and Corrupted Context | Code | 0
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning | Code | 0
Generalization and Exploration via Randomized Value Functions | Code | 0
Few-shot LLM Synthetic Data with Distribution Matching | Code | 0
Count-Based Exploration with the Successor Representation | Code | 0
Noisy Networks for Exploration | Code | 0
Fire Burns, Sword Cuts: Commonsense Inductive Bias for Exploration in Text-based Games | Code | 0
Personalized Algorithmic Recourse with Preference Elicitation | Code | 0
Estimating Risk and Uncertainty in Deep Reinforcement Learning | Code | 0
EXPODE: EXploiting POlicy Discrepancy for Efficient Exploration in Multi-agent Reinforcement Learning | Code | 0
A diversity-enhanced genetic algorithm for efficient exploration of parameter spaces | Code | 0
Feature Interaction Aware Automated Data Representation Transformation | Code | 0
Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration | Code | 0
Online Limited Memory Neural-Linear Bandits with Likelihood Matching | Code | 0
Exploring through Random Curiosity with General Value Functions | Code | 0
Federated Control with Hierarchical Multi-Agent Deep Reinforcement Learning | Code | 0
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning | Code | 0
LECO: Learnable Episodic Count for Task-Specific Intrinsic Reward | Code | 0
Bayesian Curiosity for Efficient Exploration in Reinforcement Learning | Code | 0
Principled Exploration via Optimistic Bootstrapping and Backward Induction | Code | 0
Distributional Perturbation for Efficient Exploration in Distributional Reinforcement Learning | — | 0
Distilling Realizable Students from Unrealizable Teachers | — | 0
Discovering Context Specific Causal Relationships | — | 0
BooVI: Provably Efficient Bootstrapped Value Iteration | — | 0
DISCO-10M: A Large-Scale Music Dataset | — | 0
Directed Exploration in PAC Model-Free Reinforcement Learning | — | 0
Biased Estimates of Advantages over Path Ensembles | — | 0
A Simple Unified Uncertainty-Guided Framework for Offline-to-Online Reinforcement Learning | — | 0
Page 4 of 11

No leaderboard results yet.