SOTAVerified

Efficient Exploration

Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The main challenge is balancing the exploitation of current estimates against gaining information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
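
The trade-off described above can be illustrated with a minimal sketch on a multi-armed bandit. This is not the method of the source paper. It uses UCB1, a standard and much simpler rule, which adds to each arm's current estimate a bonus that grows for arms tried only a few times. The function names (`ucb1_select`, `run_bandit`), the Bernoulli reward probabilities, the step count, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Return the index of the arm with the highest upper confidence bound.

    counts[a] is how often arm a was pulled, means[a] its average reward,
    and t the current (1-based) time step. c=2.0 gives the usual UCB1 bonus.
    """
    # Pull every arm once before trusting any estimate.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Exploit: means[a]. Explore: bonus that shrinks as counts[a] grows.
    scores = [
        means[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(true_probs, steps=5000, seed=0):
    """Simulate UCB1 on a Bernoulli bandit (hypothetical reward probabilities)."""
    rng = random.Random(seed)
    counts = [0] * len(true_probs)
    means = [0.0] * len(true_probs)
    for t in range(1, steps + 1):
        arm = ucb1_select(counts, means, t)
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running average reward.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    counts, means = run_bandit([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(m, 3) for m in means])
```

In this sketch the exploration bonus keeps the weaker arms from being abandoned after a few unlucky draws, while most pulls still go to the arm with the best estimate. Deep RL methods for efficient exploration, such as those listed below, address the same tension in much larger state and action spaces.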

Papers

Showing 176-200 of 514 papers

| Title | Status | Hype |
|---|---|---|
| Evolutionary Reinforcement Learning via Cooperative Coevolution | | 0 |
| An Offline Reinforcement Learning Algorithm Customized for Multi-Task Fusion in Large-Scale Recommender Systems | | 0 |
| Sampling for Model Predictive Trajectory Planning in Autonomous Driving using Normalizing Flows | | 0 |
| Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration | | 0 |
| Cognitive Planning for Object Goal Navigation using Generative AI Models | | 0 |
| VDSC: Enhancing Exploration Timing with Value Discrepancy and State Counts | | 0 |
| Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time | | 0 |
| Explore until Confident: Efficient Exploration for Embodied Question Answering | | 0 |
| A Straightforward Gradient-Based Approach for High-Tc Superconductor Design: Leveraging Domain Knowledge via Adaptive Constraints | | 0 |
| Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Code | 0 |
| Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning | | 0 |
| Scalable Online Exploration via Coverability | Code | 0 |
| Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces | | 0 |
| A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage | | 0 |
| Vlearn: Off-Policy Learning with Efficient State-Value Function Estimation | | 0 |
| Noisy Spiking Actor Network for Exploration | | 0 |
| ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization | | 0 |
| Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits | Code | 0 |
| Diffusion Models Meet Contextual Bandits with Large Action Spaces | | 0 |
| Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization | Code | 0 |
| TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments | | 0 |
| Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent | Code | 0 |
| Efficient Exploration for LLMs | | 0 |
| Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning | | 0 |
| FIT-SLAM -- Fisher Information and Traversability estimation-based Active SLAM for exploration in 3D environments | | 0 |
Page 8 of 21

No leaderboard results yet.