SOTAVerified

Multi-Armed Bandits

Multi-armed bandits refer to a task where a fixed amount of resources must be allocated among competing choices in a way that maximizes expected gain. Typically, these problems involve an exploration/exploitation trade-off.
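As an illustration of the exploration/exploitation trade-off, here is a minimal sketch of an epsilon-greedy strategy on simulated Bernoulli arms. It is not taken from any of the papers listed on this page. The function name `epsilon_greedy`, the arm means, and the parameter values are assumptions chosen for the example. The regret tracked here is the expected (pseudo) regret, computed from the true arm means that only the simulator knows.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on simulated Bernoulli arms (illustrative sketch).

    Returns (estimated means, pull counts, cumulative pseudo-regret).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    best_mean = max(true_means)  # known only to the simulator
    regret = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Draw a Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        # Pseudo-regret: gap between the best arm's mean and the chosen arm's mean.
        regret += best_mean - true_means[arm]

    return estimates, counts, regret


# Hypothetical arm means, chosen only for this example.
estimates, counts, regret = epsilon_greedy([0.2, 0.5, 0.8])
print("pull counts:", counts)
print("estimated means:", [round(e, 3) for e in estimates])
print("cumulative pseudo-regret:", round(regret, 1))
```

A larger `epsilon` spends more pulls exploring, which improves the estimates but adds regret. A smaller `epsilon` exploits sooner, at the risk of settling on a suboptimal arm.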


Papers

Showing 876-900 of 1262 papers

| Title | Status | Hype |
|---|---|---|
| Second Order Bounds for Contextual Bandits with Function Approximation | | 0 |
| Selecting the best system and multi-armed bandits | | 0 |
| Selective Harvesting over Networks | | 0 |
| Selective Intervention Planning using Restless Multi-Armed Bandits to Improve Maternal and Child Health Outcomes | | 0 |
| Selectively Contextual Bandits | | 0 |
| Selective Reviews of Bandit Problems in AI via a Statistical View | | 0 |
| Selfish Robustness and Equilibria in Multi-Player Bandits | | 0 |
| Self-Supervised Contextual Bandits in Computer Vision | | 0 |
| Self-Tuning Bandits over Unknown Covariate-Shifts | | 0 |
| Semantic Parsing for Planning Goals as Constrained Combinatorial Contextual Bandits | | 0 |
| Semi-Parametric Batched Global Multi-Armed Bandits with Covariates | | 0 |
| Semi-Parametric Contextual Bandits with Graph-Laplacian Regularization | | 0 |
| Sequential Batch Learning in Finite-Action Linear Contextual Bandits | | 0 |
| Sequential Best-Arm Identification with Application to Brain-Computer Interface | | 0 |
| Constrained Restless Bandits for Dynamic Scheduling in Cyber-Physical Systems | | 0 |
| Sequential Design for Ranking Response Surfaces | | 0 |
| Sequential Monte Carlo Bandits | | 0 |
| Settling the Communication Complexity for Distributed Offline Reinforcement Learning | | 0 |
| SHAP@k: Efficient and Probably Approximately Correct (PAC) Identification of Top-k Features | | 0 |
| Sharp Analysis for KL-Regularized Contextual Bandits and RLHF | | 0 |
| Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms | | 0 |
| Shuffle Private Linear Contextual Bandits | | 0 |
| Simple Regret Minimization for Contextual Bandits | | 0 |
| Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition | | 0 |
| Skyline Identification in Multi-Armed Bandits | | 0 |
Page 36 of 51

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |