SOTAVerified

Multi-Armed Bandits

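The multi-armed bandit problem is a task in which a fixed, limited set of resources must be allocated among competing choices so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance trying arms whose payoff is still uncertain against repeatedly choosing the arm that currently looks best.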
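The trade-off can be illustrated with a minimal sketch of epsilon-greedy, one common baseline strategy, on simulated Bernoulli arms. Everything in it is an assumption chosen for illustration and does not come from any paper listed on this page: the function name `run_epsilon_greedy`, the arm means, the step count, and the value of `epsilon`. The regret it returns is the pseudo-regret, that is, the sum over steps of the gap between the best arm's mean and the mean of the arm actually pulled.

```python
import random


def run_epsilon_greedy(means, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms.

    `means` are the true success probabilities (hypothetical values,
    known only to the simulator). Returns (pull counts, cumulative regret).
    """
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean of observed rewards per arm
    best_mean = max(means)
    regret = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        # Pseudo-regret: expected loss relative to always pulling the best arm.
        regret += best_mean - means[arm]

    return counts, regret


if __name__ == "__main__":
    counts, regret = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("cumulative regret:", round(regret, 1))
```

With a fixed `epsilon` the regret keeps growing linearly, because a constant fraction of pulls is always spent exploring. Methods such as UCB or Thompson sampling reduce exploration over time instead.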


Papers

Showing 726-750 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Contextual Decision-Making with Knapsacks Beyond the Worst Case | | 0 |
| On The Statistical Complexity of Offline Decision-Making | | 0 |
| On Top-k Selection in Multi-Armed Bandits and Hidden Bipartite Graphs | | 0 |
| On Universally Optimal Algorithms for A/B Testing | | 0 |
| Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture | | 0 |
| Open Problem: Model Selection for Contextual Bandits | | 0 |
| Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards | | 0 |
| Optimal Activation of Halting Multi-Armed Bandit Models | | 0 |
| Optimal Algorithms for Range Searching over Multi-Armed Bandits | | 0 |
| Optimal Algorithms for Stochastic Contextual Preference Bandits | | 0 |
| Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards | | 0 |
| Optimal and Adaptive Off-policy Evaluation in Contextual Bandits | | 0 |
| Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima | | 0 |
| Optimal cross-learning for contextual bandits with unknown context distributions | | 0 |
| Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity | | 0 |
| Optimal Learning for Sequential Decision Making for Expensive Cost Functions with Stochastic Binary Feedbacks | | 0 |
| Towards Costless Model Selection in Contextual Bandits: A Bias-Variance Perspective | | 0 |
| Optimal Multi-Objective Best Arm Identification with Fixed Confidence | | 0 |
| Optimal No-regret Learning in Repeated First-price Auctions | | 0 |
| Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits | | 0 |
| Optimal Streaming Algorithms for Multi-Armed Bandits | | 0 |
| Optimistic Information Directed Sampling | | 0 |
| Optimism in the Face of Ambiguity Principle for Multi-Armed Bandits | | 0 |
| Optimizing Online Advertising with Multi-Armed Bandits: Mitigating the Cold Start Problem under Auction Dynamics | | 0 |
| Optimizing Sharpe Ratio: Risk-Adjusted Decision-Making in Multi-Armed Bandits | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |