SOTAVerified

Efficient Exploration

Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The main challenge is balancing the exploitation of current value estimates against gaining information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
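
As a rough illustration of the trade-off described above, here is a minimal sketch of epsilon-greedy action selection on a Bernoulli multi-armed bandit. This is a textbook baseline, not the method of the source paper; the function name `epsilon_greedy_bandit`, its parameters, and the bandit setup are all assumptions made for the example.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=2000, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (hypothetical toy setup).

    Returns (estimates, counts): the running mean reward per arm and the
    number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # current value estimate for each arm
    counts = [0] * n_arms       # how often each arm has been pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pull a uniformly random arm to gain information.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm with the highest current estimate
            # (ties resolve to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Bernoulli reward drawn from the arm's true (hidden) mean.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8], epsilon=0.1)
    print("estimates:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

Setting `epsilon=0.0` with `true_means=[0.0, 1.0]` shows why pure exploitation fails in this sketch: both estimates start at zero, the tie resolves to the first arm, that arm never pays out, and the better arm is never tried. Any positive `epsilon` eventually samples every arm, at the cost of spending some pulls on arms already known to be poor.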

Papers

Showing 101–150 of 514 papers

Title | Status | Hype
Synergistic Fusion of Multi-Source Knowledge via Evidence Theory for High-Entropy Alloy Discovery | - | 0
DiffExp: Efficient Exploration in Reward Fine-tuning for Text-to-Image Diffusion Models | - | 0
FragFM: Hierarchical Framework for Efficient Molecule Generation via Fragment-Level Discrete Flow Matching | - | 0
Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts | - | 0
Massively Scaling Explicit Policy-conditioned Value Functions | - | 0
Causal Information Prioritization for Efficient Reinforcement Learning | - | 0
Exploratory Diffusion Model for Unsupervised Reinforcement Learning | - | 0
Guided Exploration for Efficient Relational Model Learning | - | 0
Few-shot_LLM_Synthetic_Data_with_Distribution_Matching | Code | 0
Adaptive Exploration for Multi-Reward Multi-Policy Evaluation | - | 0
Constrained Hybrid Metaheuristic Algorithm for Probabilistic Neural Networks Learning | - | 0
Mapping Galaxy Images Across Ultraviolet, Visible and Infrared Bands Using Generative Deep Learning | Code | 0
Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs | Code | 0
Bridging Text and Crystal Structures: Literature-driven Contrastive Learning for Materials Science | - | 0
ActiveGAMER: Active GAussian Mapping through Efficient Rendering | - | 0
β-DQN: Improving Deep Q-Learning By Evolving the Behavior | - | 0
Provably Efficient Exploration in Reward Machines with Low Regret | - | 0
A diversity-enhanced genetic algorithm for efficient exploration of parameter spaces | Code | 0
GraphEQA: Using 3D Semantic Scene Graphs for Real-time Embodied Question Answering | - | 0
GenPlan: Generative Sequence Models as Adaptive Planners | Code | 0
A Temporally Correlated Latent Exploration for Reinforcement Learning | - | 0
Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning | - | 0
Sample Efficient Robot Learning in Supervised Effect Prediction Tasks | - | 0
CBOL-Tuner: Classifier-pruned Bayesian optimization to explore temporally structured latent spaces for particle accelerator tuning | - | 0
Adaptformer: Sequence models as adaptive iterative planners | - | 0
Randomized-Grid Search for Hyperparameter Tuning in Decision Tree Model to Improve Performance of Cardiovascular Disease Classification | - | 0
BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving | Code | 0
Umbrella Reinforcement Learning -- computationally efficient tool for hard non-linear problems | Code | 0
Learning Dynamic Cognitive Map with Autonomous Navigation | Code | 0
Scalable Sampling for High Utility Patterns | Code | 0
Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL | - | 0
EfficientEQA: An Efficient Approach for Open Vocabulary Embodied Question Answering | - | 0
Offline-to-Online Multi-Agent Reinforcement Learning with Offline Value Function Memory and Sequential Exploration | - | 0
Scattered Forest Search: Smarter Code Space Exploration with LLMs | - | 0
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning | Code | 0
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning | - | 0
Meta-Learning Integration in Hierarchical Reinforcement Learning for Advanced Task Complexity | Code | 0
Latent Action Priors for Locomotion with Deep Reinforcement Learning | - | 0
BayesCNS: A Unified Bayesian Approach to Address Cold Start and Non-Stationarity in Search Systems at Scale | - | 0
Adaptive teachers for amortized samplers | Code | 0
Provably Efficient Exploration in Inverse Constrained Reinforcement Learning | - | 0
QueryBuilder: Human-in-the-Loop Query Development for Information Retrieval | - | 0
Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance | Code | 0
Targeting the partition function of chemically disordered materials with a generative approach based on inverse variational autoencoders | - | 0
Reinforcement Learning for Causal Discovery without Acyclicity Constraints | - | 0
Emotion-Agent: Unsupervised Deep Reinforcement Learning with Distribution-Prototype Reward for Continuous Emotional EEG Analysis | - | 0
Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction | - | 0
Efficient Exploration in Deep Reinforcement Learning: A Novel Bayesian Actor-Critic Algorithm | - | 0
Modeling Multi-Step Scientific Processes with Graph Transformer Networks | - | 0
KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance | - | 0

No leaderboard results yet.