Sample-Efficient Alignment for LLMs | Nov 3, 2024 | Thompson Sampling | Code available
Langevin Monte Carlo for Contextual Bandits | Jun 22, 2022 | Multi-Armed Bandits, Thompson Sampling | Code available
Federated Bayesian Optimization via Thompson Sampling | Oct 20, 2020 | Bayesian Optimization, Computational Efficiency | Code available
Optimal Thompson Sampling strategies for support-aware CVaR bandits | Dec 10, 2020 | Thompson Sampling | Code available
A Tutorial on Thompson Sampling | Jul 7, 2017 | Active Learning, Product Recommendation | Code available
An empirical evaluation of active inference in multi-armed bandits | Jan 21, 2021 | BIG-bench Machine Learning, Decision Making | Code available
Mercer Features for Efficient Combinatorial Bayesian Optimization | Dec 14, 2020 | Bayesian Optimization, Thompson Sampling | Code available
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding | Oct 29, 2024 | Bayesian Optimization, global-optimization | Code available
Steering Generative Models with Experimental Data for Protein Fitness Optimization | May 21, 2025 | Bayesian Optimization, Thompson Sampling | Code available
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining | Oct 12, 2023 | In-Context Reinforcement Learning, reinforcement-learning | Code available
Bayesian Optimization over Permutation Spaces | Dec 2, 2021 | Bayesian Optimization, Heuristic Search | Code available
Neural Exploitation and Exploration of Contextual Bandits | May 5, 2023 | Multi-Armed Bandits, Thompson Sampling | Code available
Approximate Thompson Sampling via Epistemic Neural Networks | Feb 18, 2023 | Thompson Sampling | Code available
A Bayesian Approach to Online Planning | Jun 4, 2024 | Thompson Sampling, Uncertainty Quantification | Code available
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo | May 29, 2023 | Efficient Exploration, reinforcement-learning | Code available
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning | Jan 29, 2025 | continuous-control, Continuous Control | Code available
Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes | Jul 2, 2020 | Meta-Learning, Thompson Sampling | Code available
On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks | Feb 27, 2020 | Thompson Sampling | Code available
Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks | May 10, 2021 | Efficient Exploration, Multi-Armed Bandits | Code available
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users | May 23, 2020 | Collaborative Filtering, Conversational Recommendation | Code available
EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits | Oct 7, 2021 | Multi-Armed Bandits, Thompson Sampling | Code available
Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search | Dec 28, 2023 | Multi-Agent Path Finding, Thompson Sampling | Code available
Sample-Then-Optimize Batch Neural Thompson Sampling | Oct 13, 2022 | AutoML, Bayesian Optimization | Code available
qPOTS: Efficient batch multiobjective Bayesian optimization via Pareto optimal Thompson sampling | Oct 24, 2023 | Bayesian Optimization, Computational Efficiency | Code available
Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sampling | Apr 30, 2021 | Recommendation Systems, Thompson Sampling | Code available
Neural Thompson Sampling | Oct 2, 2020 | Multi-Armed Bandits, Thompson Sampling | Code available
Batched Bayesian optimization by maximizing the probability of including the optimum | Oct 8, 2024 | Bayesian Optimization, Diversity | Code available
Evaluating Deep Vs. Wide & Deep Learners As Contextual Bandits For Personalized Email Promo Recommendations | Jan 31, 2022 | Multi-Armed Bandits, Thompson Sampling | Code available
ESCADA: Efficient Safety and Context Aware Dose Allocation for Precision Medicine | Nov 26, 2021 | Thompson Sampling | Code available
Evolutionary Multi-Armed Bandits with Genetic Thompson Sampling | Apr 26, 2022 | Decision Making, Evolutionary Algorithms | Code available
Scalable Exploration via Ensemble++ | Jul 18, 2024 | Computational Efficiency, Decision Making | Code available
Efficient Exploration through Bayesian Deep Q-Networks | Feb 13, 2018 | Atari Games, Efficient Exploration | Code available
Efficient Optimal Selection for Composited Advertising Creatives with Tree Structure | Mar 2, 2021 | Efficient Exploration, Thompson Sampling | Code available
Modeling Human Exploration Through Resource-Rational Reinforcement Learning | Jan 27, 2022 | Meta-Learning, reinforcement-learning | Code available
Dynamic Assortment Selection and Pricing with Censored Preference Feedback | Apr 3, 2025 | Thompson Sampling | Code available
Double Thompson Sampling for Dueling Bandits | Apr 25, 2016 | Thompson Sampling | Code available
Distributed Thompson sampling under constrained communication | Oct 21, 2024 | Bayesian Optimization, Thompson Sampling | Code available
AIXIjs: A Software Demo for General Reinforcement Learning | May 22, 2017 | General Reinforcement Learning, OpenAI Gym | Code available
Differentially Private Online Bayesian Estimation With Adaptive Truncation | Jan 19, 2023 | Privacy Preserving, Sensitivity | Code available
Two-sided Competing Matching Recommendation Markets With Quota and Complementary Preferences Constraints | Jan 24, 2023 | Thompson Sampling | Code available
Fast, Precise Thompson Sampling for Bayesian Optimization | Nov 26, 2024 | Bayesian Optimization, STS | Code available
Adapting multi-armed bandits policies to contextual bandits scenarios | Nov 11, 2018 | Binary Classification, Classification | Code available
Cost-Efficient Online Decision Making: A Combinatorial Multi-Armed Bandit Approach | Aug 21, 2023 | Decision Making, Multi-Armed Bandits | Code available
Causal Bandits for Linear Structural Equation Models | Aug 26, 2022 | Thompson Sampling | Code available
Process-constrained batch Bayesian approaches for yield optimization in multi-reactor systems | Aug 5, 2024 | Bayesian Optimization, Thompson Sampling | Code available
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit | May 7, 2024 | Federated Learning, Thompson Sampling | Code available
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit | Aug 8, 2024 | Federated Learning, Thompson Sampling | Code available
RoME: A Robust Mixed-Effects Bandit Algorithm for Optimizing Mobile Health Interventions | Dec 11, 2023 | Multi-Armed Bandits, Off-policy evaluation | Code available
Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic | Aug 6, 2024 | Multi-Agent Path Finding, Self-Learning | Code available
Bayesian Optimization for Categorical and Category-Specific Continuous Inputs | Nov 28, 2019 | Bayesian Optimization, BIG-bench Machine Learning | Code available