SOTAVerified

Efficient Exploration

Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The central challenge is balancing exploitation of current estimates against gathering information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
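
The trade-off described above can be illustrated on the simplest possible setting, a multi-armed bandit. The sketch below uses the classic UCB1 rule, not the method of the cited source: each arm is scored by its current mean reward (exploitation) plus a bonus that shrinks the more often the arm has been pulled (information gain). The names `ucb1` and `make_bernoulli_bandit`, the constant `c`, and the arm probabilities are all assumptions made for illustration.

```python
import math
import random


def ucb1(pull, n_arms, n_rounds, c=2.0):
    """Run UCB1 for n_rounds; return (pull counts, mean reward estimates).

    `pull(arm)` is a caller-supplied function returning a reward in [0, 1].
    `c` scales the exploration bonus (c=2.0 gives the textbook UCB1 bonus).
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            # Pull every arm once so each estimate is defined.
            arm = t - 1
        else:
            # Current estimate (exploit) plus a bonus that is large for
            # rarely pulled arms (explore).
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(c * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


def make_bernoulli_bandit(probs, seed=0):
    """Hypothetical test environment: arm i pays 1.0 with probability probs[i]."""
    rng = random.Random(seed)
    return lambda arm: 1.0 if rng.random() < probs[arm] else 0.0


if __name__ == "__main__":
    bandit = make_bernoulli_bandit([0.2, 0.5, 0.8])
    counts, means = ucb1(bandit, n_arms=3, n_rounds=2000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(m, 2) for m in means])
```

In a run like this, most pulls should end up on the arm with the highest payout probability, while the other arms are still sampled occasionally so their estimates do not stay stale. Deep reinforcement learning methods face the same tension over states and actions rather than a handful of arms, which is why count-based bonuses of this kind do not carry over directly.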

Papers

Showing 101-150 of 514 papers

Title | Status | Hype
Instance Temperature Knowledge Distillation | Code | 0
ASCENT: Amplifying Power Side-Channel Resilience via Learning & Monte-Carlo Tree Search | Code | 0
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation | - | 0
Efficient gPC-based quantification of probabilistic robustness for systems in neuroscience | - | 0
Exploration by Learning Diverse Skills through Successor State Measures | - | 0
World Models with Hints of Large Language Models for Goal Achieving | - | 0
OTO Planner: An Efficient Only Travelling Once Exploration Planner for Complex and Unknown Environments | Code | 0
Robust quantum dots charge autotuning using neural network uncertainty | Code | 0
Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives | Code | 0
Efficient Exploration of the Rashomon Set of Rule Set Models | Code | 0
NeoRL: Efficient Exploration for Nonepisodic RL | - | 0
Computing low-thrust transfers in the asteroid belt, a comparison between astrodynamical manipulations and a machine learning approach | - | 0
Efficient Exploration in Average-Reward Constrained Reinforcement Learning: Achieving Near-Optimal Regret With Posterior Sampling | Code | 0
Opinion-Guided Reinforcement Learning | - | 0
Evolutionary Large Language Model for Automated Feature Transformation | Code | 1
GLaD: Synergizing Molecular Graphs and Language Descriptors for Enhanced Power Conversion Efficiency Prediction in Organic Photovoltaic Devices | - | 0
Intrinsic Rewards for Exploration without Harm from Observational Noise: A Simulation Study Based on the Free Energy Principle | - | 0
Navigating Chemical Space with Latent Flows | Code | 1
MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure | - | 0
Efficient Exploration of Image Classifier Failures with Bayesian Optimization and Text-to-Image Models | - | 0
Evolutionary Reinforcement Learning via Cooperative Coevolution | - | 0
An Offline Reinforcement Learning Algorithm Customized for Multi-Task Fusion in Large-Scale Recommender Systems | - | 0
Sampling for Model Predictive Trajectory Planning in Autonomous Driving using Normalizing Flows | - | 0
Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach | Code | 7
Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration | - | 0
Cognitive Planning for Object Goal Navigation using Generative AI Models | - | 0
VDSC: Enhancing Exploration Timing with Value Discrepancy and State Counts | - | 0
Explore until Confident: Efficient Exploration for Embodied Question Answering | - | 0
Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time | - | 0
A Straightforward Gradient-Based Approach for High-Tc Superconductor Design: Leveraging Domain Knowledge via Adaptive Constraints | - | 0
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Code | 0
Diffusion-Reinforcement Learning Hierarchical Motion Planning in Multi-agent Adversarial Games | Code | 1
MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning | Code | 1
Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning | - | 0
Scalable Online Exploration via Coverability | Code | 0
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage | - | 0
Vlearn: Off-Policy Learning with Efficient State-Value Function Estimation | - | 0
Finding Waldo: Towards Efficient Exploration of NeRF Scene Spaces | - | 0
Noisy Spiking Actor Network for Exploration | - | 0
Cradle: Empowering Foundation Agents Towards General Computer Control | Code | 7
GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction | Code | 2
ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization | - | 0
Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits | Code | 0
Diffusion Models Meet Contextual Bandits with Large Action Spaces | - | 0
Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization | Code | 0
Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following | Code | 2
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities | Code | 2
Safe Guaranteed Exploration for Non-linear Systems | Code | 1
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? | Code | 1
LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology | Code | 2
Page 3 of 11

No leaderboard results yet.