
Efficient Exploration

Efficient exploration is one of the main obstacles in scaling up modern deep reinforcement learning algorithms. The central challenge is balancing the exploitation of current value estimates against gaining information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
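The exploitation-versus-information trade-off described above is commonly handled with randomized value estimates, of which Thompson sampling on a multi-armed bandit is the textbook instance. The sketch below is illustrative only (it is not code from the cited paper): each arm keeps a Beta posterior over its success rate, and acting greedily with respect to a posterior *sample* explores uncertain arms automatically. The function name and parameters are my own.

```python
import random

def thompson_sampling(true_means, steps=5000, seed=0):
    """Bernoulli bandit via Thompson sampling.

    Each arm keeps a Beta(successes + 1, failures + 1) posterior over its
    reward probability. Acting greedily on a sample from that posterior
    balances exploiting well-estimated arms against trying uncertain ones.
    Returns the number of pulls per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    succ = [0] * k   # observed successes per arm
    fail = [0] * k   # observed failures per arm
    pulls = [0] * k
    for _ in range(steps):
        # Draw one plausible mean per arm from its posterior, pick the best.
        samples = [rng.betavariate(succ[a] + 1, fail[a] + 1) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        succ[arm] += reward
        fail[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.2, 0.5, 0.8])
print(pulls)
```

As the posteriors concentrate, pulls shift toward the best arm; the randomized-value-function methods listed below extend this idea from bandit arms to value functions over full state-action spaces.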

Papers

Showing 151–200 of 514 papers

Title | Code available | Hype
Image-Based Deep Reinforcement Learning with Intrinsically Motivated Stimuli: On the Execution of Complex Robotic Tasks | No | 0
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning | No | 0
Online Learning for Autonomous Management of Intent-based 6G Networks | No | 0
ParamsDrag: Interactive Parameter Space Exploration via Image-Space Dragging | No | 0
Scalable Exploration via Ensemble++ | Yes | 0
Preference-Guided Reinforcement Learning for Efficient Exploration | Yes | 0
Uncertainty-Guided Optimization on Large Language Model Search Trees | Yes | 0
ASCENT: Amplifying Power Side-Channel Resilience via Learning & Monte-Carlo Tree Search | Yes | 0
Instance Temperature Knowledge Distillation | Yes | 0
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation | No | 0
Efficient gPC-based quantification of probabilistic robustness for systems in neuroscience | No | 0
Exploration by Learning Diverse Skills through Successor State Measures | No | 0
World Models with Hints of Large Language Models for Goal Achieving | No | 0
OTO Planner: An Efficient Only Travelling Once Exploration Planner for Complex and Unknown Environments | Yes | 0
Robust quantum dots charge autotuning using neural network uncertainty | Yes | 0
Efficient Exploration of the Rashomon Set of Rule Set Models | Yes | 0
Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives | Yes | 0
NeoRL: Efficient Exploration for Nonepisodic RL | No | 0
Computing low-thrust transfers in the asteroid belt, a comparison between astrodynamical manipulations and a machine learning approach | No | 0
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling | Yes | 0
Opinion-Guided Reinforcement Learning | No | 0
GLaD: Synergizing Molecular Graphs and Language Descriptors for Enhanced Power Conversion Efficiency Prediction in Organic Photovoltaic Devices | No | 0
Intrinsic Rewards for Exploration without Harm from Observational Noise: A Simulation Study Based on the Free Energy Principle | No | 0
MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure | No | 0
Efficient Exploration of Image Classifier Failures with Bayesian Optimization and Text-to-Image Models | No | 0
Evolutionary Reinforcement Learning via Cooperative Coevolution | No | 0
An Offline Reinforcement Learning Algorithm Customized for Multi-Task Fusion in Large-Scale Recommender Systems | No | 0
Sampling for Model Predictive Trajectory Planning in Autonomous Driving using Normalizing Flows | No | 0
Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration | No | 0
Cognitive Planning for Object Goal Navigation using Generative AI Models | No | 0
VDSC: Enhancing Exploration Timing with Value Discrepancy and State Counts | No | 0
Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time | No | 0
Explore until Confident: Efficient Exploration for Embodied Question Answering | No | 0
A Straightforward Gradient-Based Approach for High-Tc Superconductor Design: Leveraging Domain Knowledge via Adaptive Constraints | No | 0
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Yes | 0
Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning | No | 0
Scalable Online Exploration via Coverability | Yes | 0
Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces | No | 0
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage | No | 0
Vlearn: Off-Policy Learning with Efficient State-Value Function Estimation | No | 0
Noisy Spiking Actor Network for Exploration | No | 0
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization | No | 0
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits | Yes | 0
Diffusion Models Meet Contextual Bandits with Large Action Spaces | No | 0
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization | Yes | 0
TopoNav: Topological Navigation for Efficient Exploration in Sparse Reward Environments | No | 0
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent | Yes | 0
Efficient Exploration for LLMs | No | 0
Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning | No | 0
FIT-SLAM -- Fisher Information and Traversability estimation-based Active SLAM for exploration in 3D environments | No | 0
Page 4 of 11
