Sample-Efficient Alignment for LLMs | Nov 3, 2024 | Thompson Sampling | Code Available | 4
Batched Bayesian optimization by maximizing the probability of including the optimum | Oct 8, 2024 | Bayesian Optimization, Diversity | Code Available | 1
Approximate Thompson Sampling via Epistemic Neural Networks | Feb 18, 2023 | Thompson Sampling | Code Available | 1
Steering Generative Models with Experimental Data for Protein Fitness Optimization | May 21, 2025 | Bayesian Optimization, Thompson Sampling | Code Available | 1
EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits | Oct 7, 2021 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
An empirical evaluation of active inference in multi-armed bandits | Jan 21, 2021 | BIG-bench Machine Learning, Decision Making | Code Available | 1
Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks | May 10, 2021 | Efficient Exploration, Multi-Armed Bandits | Code Available | 1
Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes | Jul 2, 2020 | Meta-Learning, Thompson Sampling | Code Available | 1
Sample-Then-Optimize Batch Neural Thompson Sampling | Oct 13, 2022 | AutoML, Bayesian Optimization | Code Available | 1
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining | Oct 12, 2023 | In-Context Reinforcement Learning, reinforcement-learning | Code Available | 1
On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks | Feb 27, 2020 | Thompson Sampling | Code Available | 1
Langevin Monte Carlo for Contextual Bandits | Jun 22, 2022 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
A Tutorial on Thompson Sampling | Jul 7, 2017 | Active Learning, Product Recommendation | Code Available | 1
A Bayesian Approach to Online Planning | Jun 4, 2024 | Thompson Sampling, Uncertainty Quantification | Code Available | 1
Neural Exploitation and Exploration of Contextual Bandits | May 5, 2023 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Bayesian Optimization over Permutation Spaces | Dec 2, 2021 | Bayesian Optimization, Heuristic Search | Code Available | 1
Federated Bayesian Optimization via Thompson Sampling | Oct 20, 2020 | Bayesian Optimization, Computational Efficiency | Code Available | 1
Mercer Features for Efficient Combinatorial Bayesian Optimization | Dec 14, 2020 | Bayesian Optimization, Thompson Sampling | Code Available | 1
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding | Oct 29, 2024 | Bayesian Optimization, global-optimization | Code Available | 1
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo | May 29, 2023 | Efficient Exploration, reinforcement-learning | Code Available | 1
Optimal Thompson Sampling strategies for support-aware CVaR bandits | Dec 10, 2020 | Thompson Sampling | Code Available | 1
Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search | Dec 28, 2023 | Multi-Agent Path Finding, Thompson Sampling | Code Available | 1
qPOTS: Efficient batch multiobjective Bayesian optimization via Pareto optimal Thompson sampling | Oct 24, 2023 | Bayesian Optimization, Computational Efficiency | Code Available | 1
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users | May 23, 2020 | Collaborative Filtering, Conversational Recommendation | Code Available | 1
Neural Thompson Sampling | Oct 2, 2020 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning | Jan 29, 2025 | continuous-control, Continuous Control | Code Available | 1
Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sampling | Apr 30, 2021 | Recommendation Systems, Thompson Sampling | Code Available | 1
Adaptive Grey-Box Fuzz-Testing with Thompson Sampling | Aug 24, 2018 | Thompson Sampling | Unverified | 0
Adaptive Gating for Single-Photon 3D Imaging | Nov 30, 2021 | Position, Thompson Sampling | Unverified | 0
A Combinatorial Semi-Bandit Approach to Charging Station Selection for Electric Vehicles | Jan 17, 2023 | Combinatorial Optimization, Thompson Sampling | Unverified | 0
A Closer Look at the Worst-case Behavior of Multi-armed Bandit Algorithms | Jun 3, 2021 | Thompson Sampling | Unverified | 0
Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits | Sep 12, 2017 | Thompson Sampling | Unverified | 0
Context in Public Health for Underserved Communities: A Bayesian Approach to Online Restless Bandits | Feb 7, 2024 | Multi-Armed Bandits, Reinforcement Learning (RL) | Unverified | 0
Analyzing and Enhancing Queue Sampling for Energy-Efficient Remote Control of Bandits | May 15, 2024 | Autonomous Vehicles, Thompson Sampling | Unverified | 0
Adaptive Experimentation at Scale: A Computational Framework for Flexible Batches | Mar 21, 2023 | Benchmarking, Thompson Sampling | Unverified | 0
Adaptive Data Augmentation for Thompson Sampling | Jun 17, 2025 | Data Augmentation, Multi-Armed Bandits | Unverified | 0
Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling | Feb 20, 2025 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0
Adaptive Combinatorial Allocation | Nov 4, 2020 | Thompson Sampling | Unverified | 0
A Change-Detection Based Thompson Sampling Framework for Non-Stationary Bandits | Sep 6, 2020 | Change Detection, Thompson Sampling | Unverified | 0
A Batched Multi-Armed Bandit Approach to News Headline Testing | Aug 17, 2019 | Articles, Thompson Sampling | Unverified | 0
Analysis of Thompson Sampling for Partially Observable Contextual Multi-Armed Bandits | Oct 23, 2021 | Decision Making, Multi-Armed Bandits | Unverified | 0
An Analysis of Ensemble Sampling | Mar 2, 2022 | Thompson Sampling | Unverified | 0
Aging Bandits: Regret Analysis and Order-Optimal Learning Algorithm for Wireless Networks with Stochastic Arrivals | Dec 16, 2020 | Thompson Sampling | Unverified | 0
A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms | Mar 10, 2023 | Thompson Sampling | Unverified | 0
Accelerating Grasp Exploration by Leveraging Learned Priors | Nov 11, 2020 | Object, Thompson Sampling | Unverified | 0
A General Theory of the Stochastic Linear Bandit and Its Applications | Feb 12, 2020 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0
A Formal Solution to the Grain of Truth Problem | Sep 16, 2016 | Thompson Sampling | Unverified | 0
Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization | Dec 15, 2021 | Thompson Sampling | Unverified | 0
Aligning AI Agents via Information-Directed Sampling | Oct 18, 2024 | Thompson Sampling | Unverified | 0
AdaptEx: A Self-Service Contextual Bandit Platform | Aug 8, 2023 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0