Hypothesis Generation with Large Language Models | Apr 5, 2024 | Multi-Armed Bandits | Code Available | 2
Off-Policy Evaluation for Large Action Spaces via Embeddings | Feb 13, 2022 | Multi-Armed Bandits, Off-policy evaluation | Code Available | 2
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model | Feb 3, 2022 | Multi-Armed Bandits, Off-policy evaluation | Code Available | 2
Performance-bounded Online Ensemble Learning Method Based on Multi-armed bandits and Its Applications in Real-time Safety Assessment | Mar 19, 2025 | Ensemble Learning, Multi-Armed Bandits | Code Available | 1
Balans: Multi-Armed Bandits-based Adaptive Large Neighborhood Search for Mixed-Integer Programming Problem | Dec 18, 2024 | Combinatorial Optimization, Multi-Armed Bandits | Code Available | 1
A unifying framework for generalised Bayesian online learning in non-stationary environments | Nov 15, 2024 | Continual Learning, Multi-Armed Bandits | Code Available | 1
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits | Oct 2, 2024 | Instruction Following, Math | Code Available | 1
Discovering Minimal Reinforcement Learning Environments | Jun 18, 2024 | continuous-control, Continuous Control | Code Available | 1
In-Context Reinforcement Learning for Variable Action Spaces | Dec 20, 2023 | In-Context Reinforcement Learning, Multi-Armed Bandits | Code Available | 1
Equitable Restless Multi-Armed Bandits: A General Framework Inspired By Digital Health | Aug 17, 2023 | Decision Making, Fairness | Code Available | 1
Competing for Shareable Arms in Multi-Player Multi-Armed Bandits | May 30, 2023 | Multi-Armed Bandits | Code Available | 1
Implicitly normalized forecaster with clipping for linear and non-linear heavy-tailed multi-armed bandits | May 11, 2023 | Multi-Armed Bandits | Code Available | 1
Neural Exploitation and Exploration of Contextual Bandits | May 5, 2023 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits | Oct 31, 2022 | Multi-Armed Bandits | Code Available | 1
Anytime-valid off-policy inference for contextual bandits | Oct 19, 2022 | counterfactual, Multi-Armed Bandits | Code Available | 1
Multi-agent Dynamic Algorithm Configuration | Oct 13, 2022 | Multi-Armed Bandits, Reinforcement Learning (RL) | Code Available | 1
Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling | Jul 9, 2022 | Bayesian Optimization, Decision Making | Code Available | 1
Langevin Monte Carlo for Contextual Bandits | Jun 22, 2022 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
SplitPlace: AI Augmented Splitting and Placement of Large-Scale Neural Networks in Mobile Edge Environments | May 21, 2022 | Edge-computing, Multi-Armed Bandits | Code Available | 1
Pervasive Machine Learning for Smart Radio Environments Enabled by Reconfigurable Intelligent Surfaces | May 8, 2022 | BIG-bench Machine Learning, Deep Reinforcement Learning | Code Available | 1
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization | Nov 27, 2021 | Multi-Armed Bandits | Code Available | 1
EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits | Oct 7, 2021 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Generalized Linear Bandits with Local Differential Privacy | Jun 7, 2021 | Decision Making, Multi-Armed Bandits | Code Available | 1
Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits | Jun 3, 2021 | Multi-Armed Bandits, Off-policy evaluation | Code Available | 1
Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks | May 10, 2021 | Efficient Exploration, Multi-Armed Bandits | Code Available | 1
Federated Multi-Armed Bandits | Jan 28, 2021 | Federated Learning, Multi-Armed Bandits | Code Available | 1
An empirical evaluation of active inference in multi-armed bandits | Jan 21, 2021 | BIG-bench Machine Learning, Decision Making | Code Available | 1
BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits | Dec 1, 2020 | Clustering, Multi-Armed Bandits | Code Available | 1
Neural Thompson Sampling | Oct 2, 2020 | Multi-Armed Bandits, Thompson Sampling | Code Available | 1
Carousel Personalization in Music Streaming Apps with Contextual Bandits | Sep 14, 2020 | Multi-Armed Bandits | Code Available | 1
BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits | Jun 11, 2020 | Clustering, Multi-Armed Bandits | Code Available | 1
Efficient Contextual Bandits with Continuous Actions | Jun 10, 2020 | Multi-Armed Bandits | Code Available | 1
Unified Models of Human Behavioral Agents in Bandits, Contextual Bandits and RL | May 10, 2020 | Decision Making, Lifelong learning | Code Available | 1
Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation | Apr 2, 2020 | Multi-Armed Bandits | Code Available | 1
A Modern Introduction to Online Learning | Dec 31, 2019 | All, Multi-Armed Bandits | Code Available | 1
Multiplayer Multi-armed Bandits for Optimal Assignment in Heterogeneous Networks | Jan 12, 2019 | Multi-Armed Bandits | Code Available | 1
Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling | Oct 29, 2018 | Collaborative Filtering, Decision Making | Code Available | 1
Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards | Jun 20, 2025 | Decision Making Under Uncertainty, Multi-Armed Bandits | Unverified | 0
A General Framework for Off-Policy Learning with Partially-Observed Reward | Jun 17, 2025 | Multi-Armed Bandits | Unverified | 0
Adaptive Data Augmentation for Thompson Sampling | Jun 17, 2025 | Data Augmentation, Multi-Armed Bandits | Unverified | 0
Adaptive Action Duration with Contextual Bandits for Deep Reinforcement Learning in Dynamic Environments | Jun 17, 2025 | Atari Games, Board Games | Code Available | 0
Stochastic Multi-Objective Multi-Armed Bandits: Regret Definition and Algorithm | Jun 16, 2025 | Multi-Armed Bandits | Unverified | 0
Collaborative Min-Max Regret in Grouped Multi-Armed Bandits | Jun 12, 2025 | Multi-Armed Bandits | Unverified | 0
Meet Me at the Arm: The Cooperative Multi-Armed Bandits Problem with Shareable Arms | Jun 11, 2025 | Capacity Estimation, Multi-Armed Bandits | Unverified | 0
Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards | Jun 5, 2025 | Experimental Design, Multi-Armed Bandits | Unverified | 0
From Theory to Practice with RAVEN-UCB: Addressing Non-Stationarity in Multi-Armed Bandits through Variance Adaptation | Jun 3, 2025 | Multi-Armed Bandits | Code Available | 0
VirnyFlow: A Design Space for Responsible Model Development | Jun 2, 2025 | AutoML, Bayesian Optimization | Code Available | 0
Quick-Draw Bandits: Quickly Optimizing in Nonstationary Environments with Extremely Many Arms | May 30, 2025 | Multi-Armed Bandits | Unverified | 0
COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents | May 29, 2025 | Multi-Armed Bandits | Unverified | 0
A Reinforcement-Learning-Enhanced LLM Framework for Automated A/B Testing in Personalized Marketing | May 27, 2025 | Marketing, Multi-Armed Bandits | Unverified | 0