Title | Date | Tags | Status
Counterfactual Inference under Thompson Sampling | Apr 3, 2025 | Causal Inference, counterfactual | Unverified
Sparse Nonparametric Contextual Bandits | Mar 20, 2025 | Multi-Armed Bandits, Thompson Sampling | Unverified
Bandit-Based Prompt Design Strategy Selection Improves Prompt Optimizers | Mar 3, 2025 | Prompt Engineering, Thompson Sampling | Code Available
Achieving adaptivity and optimality for multi-armed bandits using Exponential-Kullback Leibler Maillard Sampling | Feb 20, 2025 | Multi-Armed Bandits, Thompson Sampling | Unverified
An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces | Feb 20, 2025 | Bayesian Optimization, Thompson Sampling | Unverified
Uncertainty-Aware Search and Value Models: Mitigating Search Scaling Flaws in LLMs | Feb 16, 2025 | GSM8K, Thompson Sampling | Unverified
When and why randomised exploration works (in linear bandits) | Feb 13, 2025 | Thompson Sampling | Unverified
KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems | Feb 11, 2025 | Thompson Sampling | Unverified
Contextual Thompson Sampling via Generation of Missing Data | Feb 10, 2025 | Decision Making, Fairness | Unverified
An Information-Theoretic Analysis of Thompson Sampling with Infinite Action Spaces | Feb 4, 2025 | Thompson Sampling | Unverified
FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling | Jan 31, 2025 | Federated Learning, Thompson Sampling | Code Available
Active RLHF via Best Policy Learning from Trajectory Preference Feedback | Jan 31, 2025 | Thompson Sampling | Unverified
EVaDE: Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning | Jan 16, 2025 | Model-based Reinforcement Learning, reinforcement-learning | Unverified
Stochastically Constrained Best Arm Identification with Thompson Sampling | Jan 7, 2025 | Thompson Sampling | Unverified
Truthful mechanisms for linear bandit games with private contexts | Jan 7, 2025 | Thompson Sampling | Unverified
WAPTS: A Weighted Allocation Probability Adjusted Thompson Sampling Algorithm for High-Dimensional and Sparse Experiment Settings | Jan 7, 2025 | Thompson Sampling | Unverified
On Improved Regret Bounds In Bayesian Optimization with Gaussian Noise | Dec 25, 2024 | Bayesian Optimization, Thompson Sampling | Unverified
Generalized Bayesian deep reinforcement learning | Dec 16, 2024 | Deep Reinforcement Learning, reinforcement-learning | Unverified
An Information-Theoretic Analysis of Thompson Sampling for Logistic Bandits | Dec 3, 2024 | Thompson Sampling | Unverified
BOTS: Batch Bayesian Optimization of Extended Thompson Sampling for Severely Episode-Limited RL Settings | Nov 30, 2024 | Bayesian Optimization, Policy Gradient Methods | Unverified
Fast, Precise Thompson Sampling for Bayesian Optimization | Nov 26, 2024 | Bayesian Optimization, STS | Code Available
Epinet for Content Cold Start | Nov 20, 2024 | Recommendation Systems, Thompson Sampling | Unverified
Minimum Empirical Divergence for Sub-Gaussian Linear Bandits | Oct 31, 2024 | Multi-Armed Bandits, Off-policy evaluation | Code Available
Planning and Learning in Risk-Aware Restless Multi-Arm Bandit Problem | Oct 30, 2024 | Scheduling, Thompson Sampling | Unverified
BanditCAT and AutoIRT: Machine Learning Approaches to Computerized Adaptive Testing and Item Calibration | Oct 28, 2024 | AutoML, Thompson Sampling | Unverified
Bayesian Collaborative Bandits with Thompson Sampling for Improved Outreach in Maternal Health Program | Oct 28, 2024 | Matrix Completion, Thompson Sampling | Unverified
Robust Thompson Sampling Algorithms Against Reward Poisoning Attacks | Oct 25, 2024 | Decision Making, Sequential Decision Making | Unverified
Distributed Thompson sampling under constrained communication | Oct 21, 2024 | Bayesian Optimization, Thompson Sampling | Code Available
Aligning AI Agents via Information-Directed Sampling | Oct 18, 2024 | Thompson Sampling | Unverified
Queueing Matching Bandits with Preference Feedback | Oct 14, 2024 | Thompson Sampling | Code Available
Combinatorial Multi-armed Bandits: Arm Selection via Group Testing | Oct 14, 2024 | Multi-Armed Bandits, parameter estimation | Unverified
Gaussian Process Thompson Sampling via Rootfinding | Oct 10, 2024 | Bayesian Optimization, Decision Making | Unverified
Contextual Bandits with Non-Stationary Correlated Rewards for User Association in MmWave Vehicular Networks | Oct 8, 2024 | Multi-Armed Bandits, Thompson Sampling | Unverified
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling | Oct 7, 2024 | Continuous Control | Unverified
Thompson Sampling For Combinatorial Bandits: Polynomial Regret and Mismatched Sampling Paradox | Oct 7, 2024 | Thompson Sampling | Code Available
Improving Portfolio Optimization Results with Bandit Networks | Oct 5, 2024 | Portfolio Optimization, Recommendation Systems | Code Available
Partially Observable Contextual Bandits with Linear Payoffs | Sep 17, 2024 | Decision Making, Multi-Armed Bandits | Unverified
Modified Meta-Thompson Sampling for Linear Bandits and Its Bayes Regret Analysis | Sep 10, 2024 | Meta-Learning, Multi-Armed Bandits | Unverified
Sliding-Window Thompson Sampling for Non-Stationary Settings | Sep 8, 2024 | Decision Making, Sequential Decision Making | Unverified
Multi-Task Combinatorial Bandits for Budget Allocation | Aug 31, 2024 | Gaussian Processes, Marketing | Unverified
Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits | Aug 28, 2024 | Multi-Armed Bandits, Thompson Sampling | Unverified
An Extremely Data-efficient and Generative LLM-based Reinforcement Learning Agent for Recommenders | Aug 28, 2024 | Recommendation Systems, Thompson Sampling | Unverified
Contextual Bandit with Herding Effects: Algorithms and Recommendation Applications | Aug 26, 2024 | Multi-Armed Bandits, Thompson Sampling | Unverified
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit | Aug 8, 2024 | Federated Learning, Thompson Sampling | Code Available
Optimization-Driven Adaptive Experimentation | Aug 8, 2024 | GPU, Thompson Sampling | Unverified
Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic | Aug 6, 2024 | Multi-Agent Path Finding, Self-Learning | Code Available
Process-constrained batch Bayesian approaches for yield optimization in multi-reactor systems | Aug 5, 2024 | Bayesian Optimization, Thompson Sampling | Code Available
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback | Jul 24, 2024 | Thompson Sampling | Unverified
Thompson Sampling Itself is Differentially Private | Jul 20, 2024 | Thompson Sampling | Unverified
Scalable Exploration via Ensemble++ | Jul 18, 2024 | Computational Efficiency, Decision Making | Code Available