Optimizing Posterior Samples for Bayesian Optimization via Rootfinding | Oct 29, 2024 | Bayesian Optimization, global-optimization | Code Available | 1
Bayesian Collaborative Bandits with Thompson Sampling for Improved Outreach in Maternal Health Program | Oct 28, 2024 | Matrix Completion, Thompson Sampling | Unverified | 0
BanditCAT and AutoIRT: Machine Learning Approaches to Computerized Adaptive Testing and Item Calibration | Oct 28, 2024 | AutoML, Thompson Sampling | Unverified | 0
Robust Thompson Sampling Algorithms Against Reward Poisoning Attacks | Oct 25, 2024 | Decision Making, Sequential Decision Making | Unverified | 0
Distributed Thompson sampling under constrained communication | Oct 21, 2024 | Bayesian Optimization, Thompson Sampling | Code Available | 0
Aligning AI Agents via Information-Directed Sampling | Oct 18, 2024 | Thompson Sampling | Unverified | 0
Queueing Matching Bandits with Preference Feedback | Oct 14, 2024 | Thompson Sampling | Code Available | 0
Combinatorial Multi-armed Bandits: Arm Selection via Group Testing | Oct 14, 2024 | Multi-Armed Bandits, parameter estimation | Unverified | 0
Gaussian Process Thompson Sampling via Rootfinding | Oct 10, 2024 | Bayesian Optimization, Decision Making | Unverified | 0
Batched Bayesian optimization by maximizing the probability of including the optimum | Oct 8, 2024 | Bayesian Optimization, Diversity | Code Available | 1
Contextual Bandits with Non-Stationary Correlated Rewards for User Association in MmWave Vehicular Networks | Oct 8, 2024 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0
Thompson Sampling For Combinatorial Bandits: Polynomial Regret and Mismatched Sampling Paradox | Oct 7, 2024 | Thompson Sampling | Code Available | 0
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling | Oct 7, 2024 | continuous-control, Continuous Control | Unverified | 0
Improving Portfolio Optimization Results with Bandit Networks | Oct 5, 2024 | Portfolio Optimization, Recommendation Systems | Code Available | 0
Partially Observable Contextual Bandits with Linear Payoffs | Sep 17, 2024 | Decision Making, Multi-Armed Bandits | Unverified | 0
Modified Meta-Thompson Sampling for Linear Bandits and Its Bayes Regret Analysis | Sep 10, 2024 | Meta-Learning, Multi-Armed Bandits | Unverified | 0
Sliding-Window Thompson Sampling for Non-Stationary Settings | Sep 8, 2024 | Decision Making, Sequential Decision Making | Unverified | 0
Multi-Task Combinatorial Bandits for Budget Allocation | Aug 31, 2024 | Gaussian Processes, Marketing | Unverified | 0
An Extremely Data-efficient and Generative LLM-based Reinforcement Learning Agent for Recommenders | Aug 28, 2024 | Recommendation Systems, Thompson Sampling | Unverified | 0
Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits | Aug 28, 2024 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0
Contextual Bandit with Herding Effects: Algorithms and Recommendation Applications | Aug 26, 2024 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit | Aug 8, 2024 | Federated Learning, Thompson Sampling | Code Available | 0
Optimization-Driven Adaptive Experimentation | Aug 8, 2024 | GPU, Thompson Sampling | Unverified | 0
Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic | Aug 6, 2024 | Multi-Agent Path Finding, Self-Learning | Code Available | 0
Process-constrained batch Bayesian approaches for yield optimization in multi-reactor systems | Aug 5, 2024 | Bayesian Optimization, Thompson Sampling | Code Available | 0
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback | Jul 24, 2024 | Thompson Sampling | Unverified | 0
Thompson Sampling Itself is Differentially Private | Jul 20, 2024 | Thompson Sampling | Unverified | 0
Scalable Exploration via Ensemble++ | Jul 18, 2024 | Computational Efficiency, Decision Making | Code Available | 0
DRL-based Joint Resource Scheduling of eMBB and URLLC in O-RAN | Jul 16, 2024 | Decision Making, Deep Reinforcement Learning | Unverified | 0
Joint User Association and Pairing in Multi-UAV-Assisted NOMA Networks: A Decaying-Epsilon Thompson Sampling Framework | Jun 20, 2024 | Thompson Sampling | Unverified | 0
Preferential Multi-Objective Bayesian Optimization | Jun 20, 2024 | Autonomous Driving, Bayesian Optimization | Unverified | 0
Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits | Jun 20, 2024 | Bayesian Inference, Thompson Sampling | Unverified | 0
More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling | Jun 18, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 0
Memory Sequence Length of Data Sampling Impacts the Adaptation of Meta-Reinforcement Learning Agents | Jun 18, 2024 | continuous-control, Continuous Control | Unverified | 0
Improving Reward-Conditioned Policies for Multi-Armed Bandits using Normalized Weight Functions | Jun 16, 2024 | Multi-Armed Bandits, Policy Gradient Methods | Unverified | 0
Graph Neural Thompson Sampling | Jun 15, 2024 | Decision Making, Graph Neural Network | Unverified | 0
A Federated Online Restless Bandit Framework for Cooperative Resource Allocation | Jun 12, 2024 | Federated Learning, Multi-Armed Bandits | Unverified | 0
DISCO: An End-to-End Bandit Framework for Personalised Discount Allocation | Jun 10, 2024 | Thompson Sampling | Unverified | 0
Two-Stage Resource Allocation in Reconfigurable Intelligent Surface Assisted Hybrid Networks via Multi-Player Bandits | Jun 9, 2024 | Thompson Sampling | Unverified | 0
Adaptively Learning to Select-Rank in Online Platforms | Jun 7, 2024 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0
Speculative Decoding via Early-exiting for Faster LLM Inference with Thompson Sampling Control Mechanism | Jun 6, 2024 | Thompson Sampling | Unverified | 0
A Bayesian Approach to Online Planning | Jun 4, 2024 | Thompson Sampling, Uncertainty Quantification | Code Available | 1
Posterior Sampling via Autoregressive Generation | May 29, 2024 | Articles, Decision Making | Unverified | 0
Approximate Thompson Sampling for Learning Linear Quadratic Regulators with O(√T) Regret | May 29, 2024 | Thompson Sampling | Unverified | 0
Cost-efficient Knowledge-based Question Answering with Large Language Models | May 27, 2024 | Knowledge Graphs, Model Selection | Unverified | 0
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff | May 26, 2024 | Code Repair, Language Modeling | Unverified | 0
On Bits and Bandits: Quantifying the Regret-Information Trade-off | May 26, 2024 | Decision Making, Question Answering | Code Available | 0
Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits | May 24, 2024 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0
No Algorithmic Collusion in Two-Player Blindfolded Game with Thompson Sampling | May 23, 2024 | Thompson Sampling | Unverified | 0
Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making | May 23, 2024 | Decision Making, Sequential Decision Making | Unverified | 0