SOTAVerified

Efficient Exploration

Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The central challenge is balancing exploitation of current value estimates against gathering information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
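The trade-off described above can be illustrated with a minimal sketch. It uses epsilon-greedy action selection on a multi-armed bandit, which is a standard textbook baseline and not the method of the source paper or of any paper listed here. The function name `epsilon_greedy_bandit`, the Gaussian reward model with unit variance, and all parameter values are assumptions made for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon, steps, seed=0):
    """Run epsilon-greedy on a Gaussian bandit (illustrative assumption).

    Returns (estimates, counts): the running mean reward per arm and
    the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms
    counts = [0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a uniformly random arm to gain information.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # Hypothetical reward model: Gaussian noise around the true mean.
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


if __name__ == "__main__":
    estimates, counts = epsilon_greedy_bandit([0.1, 0.5, 0.9], 0.1, 5000)
    print("estimates:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

Setting `epsilon` to 0 makes the agent purely greedy, so it can lock onto a suboptimal arm after a few unlucky rewards. Setting it to 1 explores uniformly and never uses what it has learned. Methods in this area generally aim to spend exploratory actions more selectively than this uniform scheme does.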

Papers

Showing 26-50 of 514 papers

Title | Status | Hype
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? | Code | 1
Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis | Code | 1
PGDQN: Preference-Guided Deep Q-Network | Code | 1
Improving Protein Optimization with Smoothed Fitness Landscapes | Code | 1
Tuning Legged Locomotion Controllers via Safe Bayesian Optimization | Code | 1
A Survey of Label-Efficient Deep Learning for 3D Point Clouds | Code | 1
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo | Code | 1
Generative Colorization of Structured Mobile Web Pages | Code | 1
GeoThermalCloud: Machine Learning for Geothermal Resource Exploration | Code | 1
Learning Dexterous Manipulation from Exemplar Object Trajectories and Pre-Grasps | Code | 1
SC-Explorer: Incremental 3D Scene Completion for Safe and Efficient Exploration Mapping and Planning | Code | 1
A Langevin-like Sampler for Discrete Distributions | Code | 1
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | Code | 1
Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration | Code | 1
The Sufficiency of Off-Policyness and Soft Clipping: PPO is still Insufficient according to an Off-Policy Measure | Code | 1
NovelD: A Simple yet Effective Exploration Criterion | Code | 1
Episodic Multi-agent Reinforcement Learning with Curiosity-Driven Exploration | Code | 1
Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning | Code | 1
Hierarchical Skills for Efficient Exploration | Code | 1
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning | Code | 1
Strategically Efficient Exploration in Competitive Multi-agent Reinforcement Learning | Code | 1
MADE: Exploration via Maximizing Deviation from Explored Regions | Code | 1
Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks | Code | 1
Paradiseo: From a Modular Framework for Evolutionary Computation to the Automated Design of Metaheuristics ---22 Years of Paradiseo--- | Code | 1
State Entropy Maximization with Random Encoders for Efficient Exploration | Code | 1

No leaderboard results yet.