SOTAVerified

Efficient Exploration

Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The main challenge is balancing the exploitation of current estimates against gathering information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
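The trade-off described above can be illustrated with a minimal sketch. It uses epsilon-greedy action selection on a Bernoulli multi-armed bandit, a deliberately simple stand-in and not the randomized value function method of the cited source. The function name `epsilon_greedy_bandit`, its parameters, and the arm means are all hypothetical choices made for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=5000, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (illustrative assumption).

    Returns (estimates, counts): the running mean reward per arm and the
    number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # current value estimate for each arm
    counts = [0] * n_arms       # how often each arm has been pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: gather information about an arbitrary arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best current estimate
            # (ties resolve to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Bernoulli reward drawn from the arm's (hidden) true mean.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    est, cnt = epsilon_greedy_bandit([0.2, 0.5, 0.8], epsilon=0.1)
    print("estimates:", est)
    print("pull counts:", cnt)
```

With `epsilon=0.0` the agent in this sketch never leaves the first arm, because the untried arms keep their initial estimate of zero and ties resolve to the lowest index. That is the failure mode that exploration is meant to avoid. A larger `epsilon` spends more pulls on information gathering at the cost of immediate reward.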

Papers

Showing 301–325 of 514 papers

Title | Status | Hype
The Role of Coverage in Online Reinforcement Learning | | 0
The University of Cambridge Russian-English System at WMT13 | | 0
Thompson Sampling Algorithms for Cascading Bandits | | 0
TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments | | 0
Towards A Unified Agent with Foundation Models | | 0
Uncertainty Estimates for Efficient Neural Network-based Dialogue Policy Optimisation | | 0
Reinforcement Learning in Credit Scoring and Underwriting | | 0
Using Non-Stationary Bandits for Learning in Repeated Cournot Games with Non-Stationary Demand | | 0
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning | | 0
VASE: Variational Assorted Surprise Exploration for Reinforcement Learning | | 0
VDSC: Enhancing Exploration Timing with Value Discrepancy and State Counts | | 0
Vector Quantization using the Improved Differential Evolution Algorithm for Image Compression | | 0
Virtual Action Actor-Critic Framework for Exploration (Student Abstract) | | 0
Visual Analytics for Efficient Image Exploration and User-Guided Image Captioning | | 0
Vlearn: Off-Policy Learning with Efficient State-Value Function Estimation | | 0
Volumetric Spanners: an Efficient Exploration Basis for Learning | | 0
Weakly-Supervised Reinforcement Learning for Controllable Behavior | | 0
When Simple Exploration is Sample Efficient: Identifying Sufficient Conditions for Random Exploration to Yield PAC RL Algorithms | | 0
Where2Explore: Few-shot Affordance Learning for Unseen Novel Categories of Articulated Objects | | 0
WoMAP: World Models For Embodied Open-Vocabulary Object Localization | | 0
World Models with Hints of Large Language Models for Goal Achieving | | 0
KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance | | 0
Worst-Case Regret Bounds for Exploration via Randomized Value Functions | | 0
Comparative Analysis of Black-Box Optimization Methods for Weather Intervention Design | | 0
A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration | | 0

No leaderboard results yet.