Robust Policy Switching for Antifragile Reinforcement Learning for UAV Deconfliction in Adversarial Environments | Jun 26, 2025 | Reinforcement Learning (RL), Thompson Sampling
Unverified | 0 | Context Attribution with Multi-Armed Bandit Optimization | Jun 24, 2025 | Thompson Sampling
Unverified | 0 | Adaptive Data Augmentation for Thompson Sampling | Jun 17, 2025 | Data Augmentation, Multi-Armed Bandits
Unverified | 0 | Bayesian Optimization with Inexact Acquisition: Is Random Grid Search Sufficient? | Jun 13, 2025 | Bayesian Optimization, Thompson Sampling
Unverified | 0 | Efficient kernelized bandit algorithms via exploration distributions | Jun 11, 2025 | Thompson Sampling
Unverified | 0 | Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget | Jun 3, 2025 | Thompson Sampling
Unverified | 0 | Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling | May 29, 2025 | Bayesian Optimization, Thompson Sampling
Unverified | 0 | Thompson Sampling in Online RLHF with General Function Approximation | May 29, 2025 | Thompson Sampling
Unverified | 0 | Stable Thompson Sampling: Valid Inference via Variance Inflation | May 29, 2025 | Decision Making, Thompson Sampling
Unverified | 0 | Practical Adversarial Attacks on Stochastic Bandits via Fake Data Injection | May 28, 2025 | Thompson Sampling
Unverified | 0 | Representative Action Selection for Large Action-Space Meta-Bandits | May 23, 2025 | Thompson Sampling
Code Available | 0 | Deconfounded Warm-Start Thompson Sampling with Applications to Precision Medicine | May 22, 2025 | Thompson Sampling
Unverified | 0 | Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype | May 22, 2025 | Feature Engineering, Large Language Model
Unverified | 0 | Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions | May 22, 2025 | Large Language Model, Thompson Sampling
Unverified | 0 | In-Domain African Languages Translation Using LLMs and Multi-armed Bandits | May 21, 2025 | Domain Adaptation, Machine Translation
Unverified | 0 | Steering Generative Models with Experimental Data for Protein Fitness Optimization | May 21, 2025 | Bayesian Optimization, Thompson Sampling
Code Available | 1 | Dynamic Decision-Making under Model Misspecification | May 20, 2025 | Decision Making, model
Unverified | 0 | Addressing Missing Data Issue for Diffusion-based Recommendation | May 18, 2025 | Denoising, Thompson Sampling
Code Available | 0 | Thompson Sampling-like Algorithms for Stochastic Rising Bandits | May 17, 2025 | Model Selection, Thompson Sampling
Unverified | 0 | Leveraging Offline Data from Similar Systems for Online Linear Quadratic Control | May 14, 2025 | Thompson Sampling
Unverified | 0 | Connecting Thompson Sampling and UCB: Towards More Efficient Trade-offs Between Privacy and Regret | May 5, 2025 | Thompson Sampling
Unverified | 0 | Bayesian learning of the optimal action-value function in a Markov decision process | May 3, 2025 | Decision Making, Sequential Decision Making
Unverified | 0 | Neural Contextual Bandits Under Delayed Feedback Constraints | Apr 16, 2025 | Multi-Armed Bandits, Recommendation Systems
Unverified | 0 | Counterfactual Inference under Thompson Sampling | Apr 3, 2025 | Causal Inference, counterfactual
Unverified | 0 | Dynamic Assortment Selection and Pricing with Censored Preference Feedback | Apr 3, 2025 | Thompson Sampling
Code Available | 0 | Sparse Nonparametric Contextual Bandits | Mar 20, 2025 | Multi-Armed Bandits, Thompson Sampling
Unverified | 0 | Bandit-Based Prompt Design Strategy Selection Improves Prompt Optimizers | Mar 3, 2025 | Prompt Engineering, Thompson Sampling
Code Available | 0 | Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling | Feb 20, 2025 | Multi-Armed Bandits, Thompson Sampling
Unverified | 0 | An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces | Feb 20, 2025 | Bayesian Optimization, Thompson Sampling
Unverified | 0 | Uncertainty-Aware Search and Value Models: Mitigating Search Scaling Flaws in LLMs | Feb 16, 2025 | GSM8K, Thompson Sampling
Unverified | 0 | When and why randomised exploration works (in linear bandits) | Feb 13, 2025 | Thompson Sampling
Unverified | 0 | KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems | Feb 11, 2025 | Thompson Sampling
Unverified | 0 | Contextual Thompson Sampling via Generation of Missing Data | Feb 10, 2025 | Decision Making, Fairness
Unverified | 0 | An Information-Theoretic Analysis of Thompson Sampling with Infinite Action Spaces | Feb 4, 2025 | Thompson Sampling
Unverified | 0 | Active RLHF via Best Policy Learning from Trajectory Preference Feedback | Jan 31, 2025 | Thompson Sampling
Unverified | 0 | FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling | Jan 31, 2025 | Federated Learning, Thompson Sampling
Code Available | 0 | Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning | Jan 29, 2025 | continuous-control, Continuous Control
Code Available | 1 | EVaDE: Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning | Jan 16, 2025 | Model-based Reinforcement Learning, reinforcement-learning
Unverified | 0 | Stochastically Constrained Best Arm Identification with Thompson Sampling | Jan 7, 2025 | Thompson Sampling
Unverified | 0 | Truthful mechanisms for linear bandit games with private contexts | Jan 7, 2025 | Thompson Sampling
Unverified | 0 | WAPTS: A Weighted Allocation Probability Adjusted Thompson Sampling Algorithm for High-Dimensional and Sparse Experiment Settings | Jan 7, 2025 | Thompson Sampling
Unverified | 0 | On Improved Regret Bounds In Bayesian Optimization with Gaussian Noise | Dec 25, 2024 | Bayesian Optimization, Thompson Sampling
Unverified | 0 | Generalized Bayesian deep reinforcement learning | Dec 16, 2024 | Deep Reinforcement Learning, reinforcement-learning
Unverified | 0 | An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits | Dec 3, 2024 | Thompson Sampling
Unverified | 0 | BOTS: Batch Bayesian Optimization of Extended Thompson Sampling for Severely Episode-Limited RL Settings | Nov 30, 2024 | Bayesian Optimization, Policy Gradient Methods
Unverified | 0 | Fast, Precise Thompson Sampling for Bayesian Optimization | Nov 26, 2024 | Bayesian Optimization, STS
Code Available | 0 | Epinet for Content Cold Start | Nov 20, 2024 | Recommendation Systems, Thompson Sampling
Unverified | 0 | Sample-Efficient Alignment for LLMs | Nov 3, 2024 | Thompson Sampling
Code Available | 4 | Minimum Empirical Divergence for Sub-Gaussian Linear Bandits | Oct 31, 2024 | Multi-Armed Bandits, Off-policy evaluation
Code Available | 0 | Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem | Oct 30, 2024 | Scheduling, Thompson Sampling
— Unverified 0