Sample-Efficient Alignment for LLMs | Nov 3, 2024 | Thompson Sampling | Code Available | 4
Steering Generative Models with Experimental Data for Protein Fitness Optimization | May 21, 2025 | Bayesian Optimization, Thompson Sampling | Code Available | 1
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning | Jan 29, 2025 | continuous-control, Continuous Control | Code Available | 1
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding | Oct 29, 2024 | Bayesian Optimization, global-optimization | Code Available | 1
Batched Bayesian optimization by maximizing the probability of including the optimum | Oct 8, 2024 | Bayesian Optimization, Diversity | Code Available | 1
A Bayesian Approach to Online Planning | Jun 4, 2024 | Thompson Sampling, Uncertainty Quantification | Code Available | 1
Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search | Dec 28, 2023 | Multi-Agent Path Finding, Thompson Sampling | Code Available | 1
qPOTS: Efficient batch multiobjective Bayesian optimization via Pareto optimal Thompson sampling | Oct 24, 2023 | Bayesian Optimization, Computational Efficiency | Code Available | 1
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining | Oct 12, 2023 | In-Context Reinforcement Learning, reinforcement-learning | Code Available | 1
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo | May 29, 2023 | Efficient Exploration, reinforcement-learning | Code Available | 1
Neural Exploitation and Exploration of Contextual Bandits | May 5, 2023 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Approximate Thompson Sampling via Epistemic Neural Networks | Feb 18, 2023 | Thompson Sampling | Code Available | 1
Sample-Then-Optimize Batch Neural Thompson Sampling | Oct 13, 2022 | AutoML, Bayesian Optimization | Code Available | 1
Langevin Monte Carlo for Contextual Bandits | Jun 22, 2022 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Bayesian Optimization over Permutation Spaces | Dec 2, 2021 | Bayesian Optimization, Heuristic Search | Code Available | 1
EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits | Oct 7, 2021 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks | May 10, 2021 | Efficient Exploration, Multi-Armed Bandits | Code Available | 1
Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sampling | Apr 30, 2021 | Recommendation Systems, Thompson Sampling | Code Available | 1
An empirical evaluation of active inference in multi-armed bandits | Jan 21, 2021 | BIG-bench Machine Learning, Decision Making | Code Available | 1
Mercer Features for Efficient Combinatorial Bayesian Optimization | Dec 14, 2020 | Bayesian Optimization, Thompson Sampling | Code Available | 1
Optimal Thompson Sampling strategies for support-aware CVaR bandits | Dec 10, 2020 | Thompson Sampling | Code Available | 1
Federated Bayesian Optimization via Thompson Sampling | Oct 20, 2020 | Bayesian Optimization, Computational Efficiency | Code Available | 1
Neural Thompson Sampling | Oct 2, 2020 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Meta-Learning Stationary Stochastic Process Prediction with Convolutional Neural Processes | Jul 2, 2020 | Meta-Learning, Thompson Sampling | Code Available | 1
Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users | May 23, 2020 | Collaborative Filtering, Conversational Recommendation | Code Available | 1
On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks | Feb 27, 2020 | Thompson Sampling | Code Available | 1
A Tutorial on Thompson Sampling | Jul 7, 2017 | Active Learning, Product Recommendation | Code Available | 1
Robust Policy Switching for Antifragile Reinforcement Learning for UAV Deconfliction in Adversarial Environments | Jun 26, 2025 | Reinforcement Learning (RL), Thompson Sampling | Unverified | 0
Context Attribution with Multi-Armed Bandit Optimization | Jun 24, 2025 | Thompson Sampling | Unverified | 0
Adaptive Data Augmentation for Thompson Sampling | Jun 17, 2025 | Data Augmentation, Multi-Armed Bandits | Unverified | 0
Bayesian Optimization with Inexact Acquisition: Is Random Grid Search Sufficient? | Jun 13, 2025 | Bayesian Optimization, Thompson Sampling | Unverified | 0
Efficient kernelized bandit algorithms via exploration distributions | Jun 11, 2025 | Thompson Sampling | Unverified | 0
Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget | Jun 3, 2025 | Thompson Sampling | Unverified | 0
Stable Thompson Sampling: Valid Inference via Variance Inflation | May 29, 2025 | Decision Making, Thompson Sampling | Unverified | 0
Thompson Sampling in Online RLHF with General Function Approximation | May 29, 2025 | Thompson Sampling | Unverified | 0
Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling | May 29, 2025 | Bayesian Optimization, Thompson Sampling | Unverified | 0
Practical Adversarial Attacks on Stochastic Bandits via Fake Data Injection | May 28, 2025 | Thompson Sampling | Unverified | 0
Representative Action Selection for Large Action-Space Meta-Bandits | May 23, 2025 | Thompson Sampling | Code Available | 0
Deconfounded Warm-Start Thompson Sampling with Applications to Precision Medicine | May 22, 2025 | Thompson Sampling | Unverified | 0
Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype | May 22, 2025 | Feature Engineering, Large Language Model | Unverified | 0
Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions | May 22, 2025 | Large Language Model, Thompson Sampling | Unverified | 0
In-Domain African Languages Translation Using LLMs and Multi-armed Bandits | May 21, 2025 | Domain Adaptation, Machine Translation | Unverified | 0
Dynamic Decision-Making under Model Misspecification | May 20, 2025 | Decision Making, model | Unverified | 0
Addressing Missing Data Issue for Diffusion-based Recommendation | May 18, 2025 | Denoising, Thompson Sampling | Code Available | 0
Thompson Sampling-like Algorithms for Stochastic Rising Bandits | May 17, 2025 | Model Selection, Thompson Sampling | Unverified | 0
Leveraging Offline Data from Similar Systems for Online Linear Quadratic Control | May 14, 2025 | Thompson Sampling | Unverified | 0
Connecting Thompson Sampling and UCB: Towards More Efficient Trade-offs Between Privacy and Regret | May 5, 2025 | Thompson Sampling | Unverified | 0
Bayesian learning of the optimal action-value function in a Markov decision process | May 3, 2025 | Decision Making, Sequential Decision Making | Unverified | 0
Neural Contextual Bandits Under Delayed Feedback Constraints | Apr 16, 2025 | Multi-Armed Bandits, Recommendation Systems | Unverified | 0
Counterfactual Inference under Thompson Sampling | Apr 3, 2025 | Causal Inference, counterfactual | Unverified | 0