SOTAVerified

Multi-Armed Bandits

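Multi-armed bandits refer to a task in which a fixed, limited amount of resources must be allocated among competing choices (the "arms") so as to maximize expected gain. Each arm's payoff is only partially known when the allocation is made, so these problems typically involve an exploration/exploitation trade-off: trying uncertain arms to learn their payoffs versus playing the arm that currently looks best.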
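As a rough illustration of the exploration/exploitation trade-off, the sketch below simulates an epsilon-greedy strategy. This is one simple baseline, not a method taken from any paper listed here. The Bernoulli reward model, the arm means, and the function name `run_epsilon_greedy` are all assumptions made for the example.

```python
import random


def run_epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (hypothetical setup).

    Returns (pull counts per arm, cumulative expected regret).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a 0/1 reward with the arm's (assumed) success probability.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        # Expected regret: gap between the best arm and the chosen arm.
        regret += best_mean - true_means[arm]

    return counts, regret


if __name__ == "__main__":
    counts, regret = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("cumulative expected regret:", round(regret, 2))
```

A larger `epsilon` spends more rounds exploring, which speeds up learning but adds regret from random pulls. A smaller value exploits earlier and risks locking onto a suboptimal arm for longer.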

(Image credit: Microsoft Research)

Papers

Showing 601-650 of 1262 papers

Title | Status | Hype
Multi-Armed Bandits with Local Differential Privacy | | 0
Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards | | 0
Multi-Armed Bandits with Metric Movement Costs | | 0
Multi-Armed Bandits with Self-Information Rewards | | 0
Multi-Fidelity Multi-Armed Bandits Revisited | | 0
Multilinguality in LLM-Designed Reward Functions for Restless Bandits: Effects on Task Performance and Fairness | | 0
Multinomial Logit Contextual Bandits: Provable Optimality and Practicality | | 0
Multi-Objective Generalized Linear Bandits | | 0
Multi-Player Approaches for Dueling Bandits | | 0
Multi-Player Bandits: A Trekking Approach | | 0
Multi-Player Bandits Revisited | | 0
Multi-Player Bandits Robust to Adversarial Collisions | | 0
Multiplayer Information Asymmetric Contextual Bandits | | 0
Multi-player Multi-armed Bandits with Collision-Dependent Reward Distributions | | 0
Multi-Player Multi-Armed Bandits with Finite Shareable Resources Arms: Learning Algorithms & Applications | | 0
Decentralized Heterogeneous Multi-Player Multi-Armed Bandits with Non-Zero Rewards on Collisions | | 0
Multiple-Play Stochastic Bandits with Shareable Finite-Capacity Arms | | 0
Multiplier Bootstrap-based Exploration | | 0
MultiScale Contextual Bandits for Long Term Objectives | | 0
Multi-Statistic Approximate Bayesian Computation with Multi-Armed Bandits | | 0
Multi-Task Learning for Contextual Bandits | | 0
Multi-User MABs with User Dependent Rewards for Uncoordinated Spectrum Access | | 0
Multi-User Multi-Armed Bandits for Uncoordinated Spectrum Access | | 0
Navigating the Rashomon Effect: How Personalization Can Help Adjust Interpretable Machine Learning Models to Individual Users | | 0
Nearest Neighbor Search Under Uncertainty | | 0
Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits | | 0
Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial Corruptions | | 0
Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information | | 0
Towards a Sharp Analysis of Offline Policy Learning for f-Divergence-Regularized Contextual Bandits | | 0
Nearly Optimal Sampling Algorithms for Combinatorial Pure Exploration | | 0
Nearly-tight Approximation Guarantees for the Improving Multi-Armed Bandits Problem | | 0
Nearly Tight Bounds for Cross-Learning Contextual Bandits with Graphical Feedback | | 0
Nearly Tight Bounds for Exploration in Streaming Multi-armed Bandits with Known Optimality Gap | | 0
Near Optimal Best Arm Identification for Clustered Bandits | | 0
Near-Optimal Private Learning in Linear Contextual Bandits | | 0
Networked Restless Multi-Armed Bandits for Mobile Interventions | | 0
Networked Stochastic Multi-Armed Bandits with Combinatorial Strategies | | 0
Neural Bandit with Arm Group Graph | | 0
Neural Collaborative Filtering Bandits via Meta Learning | | 0
Neural Contextual Bandits Based Dynamic Sensor Selection for Low-Power Body-Area Networks | | 0
Neural Contextual Bandits for Personalized Recommendation | | 0
Neural Contextual Bandits Under Delayed Feedback Constraints | | 0
Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits | | 0
Neural Contextual Bandits with Deep Representation and Shallow Exploration | | 0
Neural Network Retraining for Model Serving | | 0
Neural Risk-sensitive Satisficing in Contextual Bandits | | 0
NeuralUCB: Contextual Bandits with Neural Network-Based Exploration | | 0
No DBA? No regret! Multi-armed bandits for index tuning of analytical and HTAP workloads with provable guarantees | | 0
Nonlinear Sequential Accepts and Rejects for Identification of Top Arms in Stochastic Bandits | | 0
Nonparametric Contextual Bandits in an Unknown Metric Space | | 0
Page 13 of 26

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified