SOTAVerified

Multi-Armed Bandits

The multi-armed bandit problem is a task in which a fixed budget of resources must be allocated among competing choices so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
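The exploration/exploitation trade-off described above can be illustrated with a minimal epsilon-greedy sketch. This is not from any listed paper; the Bernoulli arm means, epsilon, and step count are illustrative choices:

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=1000, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit with the given arm means.

    Returns the per-arm value estimates and pull counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms      # number of pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                      # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit: best estimate
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update
    return values, counts

values, counts = epsilon_greedy([0.2, 0.5, 0.8])
```

With epsilon = 0.1, roughly 10% of pulls go to random arms (exploration) while the rest go to the arm with the best current estimate (exploitation), so the highest-mean arm accumulates most of the pulls over time.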

Papers

Showing 481–490 of 1262 papers

| Title | Status | Hype |
| --- | --- | --- |
| Vertical Federated Linear Contextual Bandits | — | 0 |
| Anytime-valid off-policy inference for contextual bandits | Code | 1 |
| Contextual bandits with concave rewards, and an application to fair ranking | — | 0 |
| Multi-agent Dynamic Algorithm Configuration | Code | 1 |
| Simulated Contextual Bandits for Personalization Tasks from Recommendation Datasets | Code | 0 |
| Maximum entropy exploration in contextual bandits with neural networks and energy based models | — | 0 |
| Constant regret for sequence prediction with limited advice | — | 0 |
| ProtoBandit: Efficient Prototype Selection via Multi-Armed Bandits | — | 0 |
| Replicable Bandits | — | 0 |
| Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs | — | 0 |
Page 49 of 127

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | — | Unverified |
| 2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | — | Unverified |