SOTAVerified

Efficient Exploration

Achieving efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The main challenge is balancing the exploitation of current estimates against gaining information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows

Papers

Showing 126-150 of 514 papers

Title | Status | Hype
Cognitive Planning for Object Goal Navigation using Generative AI Models | | 0
VDSC: Enhancing Exploration Timing with Value Discrepancy and State Counts | | 0
Explore until Confident: Efficient Exploration for Embodied Question Answering | | 0
Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time | | 0
A Straightforward Gradient-Based Approach for High-Tc Superconductor Design: Leveraging Domain Knowledge via Adaptive Constraints | | 0
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Code | 0
Diffusion-Reinforcement Learning Hierarchical Motion Planning in Multi-agent Adversarial Games | Code | 1
MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning | Code | 1
Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning | | 0
Scalable Online Exploration via Coverability | Code | 0
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage | | 0
Vlearn: Off-Policy Learning with Efficient State-Value Function Estimation | | 0
Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces | | 0
Noisy Spiking Actor Network for Exploration | | 0
Cradle: Empowering Foundation Agents Towards General Computer Control | Code | 7
GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction | Code | 2
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization | | 0
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits | Code | 0
Diffusion Models Meet Contextual Bandits with Large Action Spaces | | 0
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization | Code | 0
Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following | Code | 2
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities | Code | 2
Safe Guaranteed Exploration for Non-linear Systems | Code | 1
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? | Code | 1
LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology | Code | 2
Page 6 of 21

No leaderboard results yet.