SOTAVerified

Multi-Armed Bandits

The multi-armed bandit problem is a sequential decision task in which a fixed budget of resources must be allocated among competing choices ("arms") so as to maximize expected gain, even though each arm's payoff distribution is only partially known. These problems typically involve an exploration/exploitation trade-off: sampling arms to learn their payoffs versus repeatedly playing the arm that currently looks best.

(Image credit: Microsoft Research)
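The exploration/exploitation trade-off can be illustrated with a minimal epsilon-greedy sketch (my own illustrative code, not tied to any paper listed below): with probability epsilon the agent tries a random arm, otherwise it plays the arm with the highest estimated mean reward. The arm means and parameter values here are arbitrary assumptions for the example.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=1000, seed=0):
    """Minimal epsilon-greedy bandit on Bernoulli arms.

    true_means: per-arm success probabilities (assumed for illustration).
    Returns the learned per-arm reward estimates and the total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how many times each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # draw a Bernoulli reward from the chosen arm
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental running-mean update
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward
```

Many of the papers below study more refined strategies (UCB-style indices, Thompson sampling, adversarial and contextual variants) that replace the fixed-epsilon exploration rule with something adaptive.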

Papers

Showing 431-440 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Queue Scheduling with Adversarial Bandit Learning | | 0 |
| Multi-Armed Bandits with Generalized Temporally-Partitioned Rewards | | 0 |
| Efficient Explorative Key-term Selection Strategies for Conversational Contextual Bandits | Code | 0 |
| Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks | | 0 |
| Approximately Stationary Bandits with Knapsacks | | 0 |
| The Choice of Noninformative Priors for Thompson Sampling in Multiparameter Bandit Models | | 0 |
| On Differentially Private Federated Linear Contextual Bandits | | 0 |
| Improved Best-of-Both-Worlds Guarantees for Multi-Armed Bandits: FTRL with General Regularizers and Multiple Optimal Arms | | 0 |
| Kernel Conditional Moment Constraints for Confounding Robust Inference | Code | 0 |
| Active Velocity Estimation using Light Curtains via Self-Supervised Multi-Armed Bandits | | 0 |
Page 44 of 127

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | | Unverified |