Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards | Jun 20, 2025 | Decision Making Under Uncertainty, Multi-Armed Bandits | Unverified | 0
Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments | Jun 17, 2025 | Atari Games, Board Games | Code Available | 0
A General Framework for Off-Policy Learning with Partially-Observed Reward | Jun 17, 2025 | Multi-Armed Bandits | Unverified | 0
Adaptive Data Augmentation for Thompson Sampling | Jun 17, 2025 | Data Augmentation, Multi-Armed Bandits | Unverified | 0
Stochastic Multi-Objective Multi-Armed Bandits: Regret Definition and Algorithm | Jun 16, 2025 | Multi-Armed Bandits | Unverified | 0
Collaborative Min-Max Regret in Grouped Multi-Armed Bandits | Jun 12, 2025 | Multi-Armed Bandits | Unverified | 0
Meet Me at the Arm: The Cooperative Multi-Armed Bandits Problem with Shareable Arms | Jun 11, 2025 | Capacity Estimation, Multi-Armed Bandits | Unverified | 0
Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards | Jun 5, 2025 | Experimental Design, Multi-Armed Bandits | Unverified | 0
From Theory to Practice with RAVEN-UCB: Addressing Non-Stationarity in Multi-Armed Bandits through Variance Adaptation | Jun 3, 2025 | Multi-Armed Bandits | Code Available | 0
VirnyFlow: A Design Space for Responsible Model Development | Jun 2, 2025 | AutoML, Bayesian Optimization | Code Available | 0
Quick-Draw Bandits: Quickly Optimizing in Nonstationary Environments with Extremely Many Arms | May 30, 2025 | Multi-Armed Bandits | Unverified | 0
COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents | May 29, 2025 | Multi-Armed Bandits | Unverified | 0
A Reinforcement-Learning-Enhanced LLM Framework for Automated A/B Testing in Personalized Marketing | May 27, 2025 | Marketing, Multi-Armed Bandits | Unverified | 0
Offline Clustering of Linear Bandits: Unlocking the Power of Clusters in Data-Limited Environments | May 25, 2025 | Clustering, Multi-Armed Bandits | Unverified | 0
Test-Time Scaling of Diffusion Models via Noise Trajectory Search | May 24, 2025 | Denoising, Image Generation | Code Available | 0
KL-regularization Itself is Differentially Private in Bandits and RLHF | May 23, 2025 | Decision Making, Multi-Armed Bandits | Unverified | 0
Scalable and Interpretable Contextual Bandits: A Literature Review and Retail Offer Prototype | May 22, 2025 | Feature Engineering, Large Language Model | Unverified | 0
In-Domain African Languages Translation Using LLMs and Multi-armed Bandits | May 21, 2025 | Domain Adaptation, Machine Translation | Unverified | 0
Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima | May 21, 2025 | Multi-Armed Bandits | Unverified | 0
Human in the Loop Adaptive Optimization for Improved Time Series Forecasting | May 21, 2025 | Language Modeling, Language Modelling | Code Available | 0
High-dimensional Nonparametric Contextual Bandit Problem | May 20, 2025 | Decision Making, Multi-Armed Bandits | Unverified | 0
Augmenting Online RL with Offline Data is All You Need: A Unified Hybrid RL Algorithm Design and Analysis | May 19, 2025 | All, Multi-Armed Bandits | Unverified | 0
Multi-Armed Bandits Meet Large Language Models | May 19, 2025 | Decision Making, Multi-Armed Bandits | Unverified | 0
Near Optimal Best Arm Identification for Clustered Bandits | May 15, 2025 | Clustering, Computational Efficiency | Unverified | 0
Batched Nonparametric Bandits via k-Nearest Neighbor UCB | May 15, 2025 | Decision Making, Marketing | Unverified | 0
Adaptive, Robust and Scalable Bayesian Filtering for Online Learning | May 12, 2025 | Continual Learning, Multi-Armed Bandits | Unverified | 0
Navigating the Rashomon Effect: How Personalization Can Help Adjust Interpretable Machine Learning Models to Individual Users | May 11, 2025 | Additive models, Interpretable Machine Learning | Unverified | 0
Adaptive Budgeted Multi-Armed Bandits for IoT with Dynamic Resource Constraints | May 5, 2025 | Multi-Armed Bandits | Unverified | 0
Preference-centric Bandits: Optimality of Mixtures and Regret-efficient Algorithms | Apr 29, 2025 | Multi-Armed Bandits, Navigate | Unverified | 0
Access Probability Optimization in RACH: A Multi-Armed Bandits Approach | Apr 18, 2025 | Multi-Armed Bandits | Unverified | 0
Neural Contextual Bandits Under Delayed Feedback Constraints | Apr 16, 2025 | Multi-Armed Bandits, Recommendation Systems | Unverified | 0
On the Problem of Best Arm Retention | Apr 16, 2025 | Multi-Armed Bandits | Unverified | 0
Learning-Based User Association for MmWave Vehicular Networks With Kernelized Contextual Bandits | Apr 15, 2025 | Multi-Armed Bandits | Unverified | 0
Towards More Efficient, Robust, Instance-adaptive, and Generalizable Sequential Decision making | Apr 12, 2025 | Decision Making, Decision Making Under Uncertainty | Unverified | 0
A Classification View on Meta Learning Bandits | Apr 6, 2025 | Classification, Meta-Learning | Unverified | 0
An Exploration-free Method for a Linear Stochastic Bandit Driven by a Linear Gaussian Dynamical System | Apr 4, 2025 | Hyperparameter Optimization, Multi-Armed Bandits | Unverified | 0
Antithetic Sampling for Top-k Shapley Identification | Apr 2, 2025 | Multi-Armed Bandits | Code Available | 0
Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries | Apr 1, 2025 | Multi-Armed Bandits | Unverified | 0
Reinforcement Learning for Machine Learning Model Deployment: Evaluating Multi-Armed Bandits in ML Ops Environments | Mar 28, 2025 | Management, Model Selection | Unverified | 0
MultiScale Contextual Bandits for Long Term Objectives | Mar 22, 2025 | Multi-Armed Bandits, Recommendation Systems | Unverified | 0
Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates | Mar 21, 2025 | Decision Making, Multi-Armed Bandits | Unverified | 0
NeuroSep-CP-LCB: A Deep Learning-based Contextual Multi-armed Bandit Algorithm with Uncertainty Quantification for Early Sepsis Prediction | Mar 20, 2025 | Conformal Prediction, Decision Making | Code Available | 0
Sparse Nonparametric Contextual Bandits | Mar 20, 2025 | Multi-Armed Bandits, Thompson Sampling | Unverified | 0
Performance-bounded Online Ensemble Learning Method Based on Multi-armed bandits and Its Applications in Real-time Safety Assessment | Mar 19, 2025 | Ensemble Learning, Multi-Armed Bandits | Code Available | 1
A New Benchmark for Online Learning with Budget-Balancing Constraints | Mar 19, 2025 | Multi-Armed Bandits | Unverified | 0
Variance-Dependent Regret Lower Bounds for Contextual Bandits | Mar 15, 2025 | Multi-Armed Bandits | Unverified | 0
Bi-Criteria Optimization for Combinatorial Bandits: Sublinear Regret and Constraint Violation under Bandit Feedback | Mar 15, 2025 | Multi-Armed Bandits | Unverified | 0
Locally Private Nonparametric Contextual Multi-armed Bandits | Mar 11, 2025 | Decision Making, Multi-Armed Bandits | Code Available | 0
Multiplayer Information Asymmetric Contextual Bandits | Mar 11, 2025 | Multi-Armed Bandits | Unverified | 0
Cost-Aware Optimal Pairwise Pure Exploration | Mar 10, 2025 | Multi-Armed Bandits | Unverified | 0