
Multi-Armed Bandits

The multi-armed bandit problem is a task in which a fixed, limited amount of resources must be allocated among competing choices (arms) so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off: the learner must balance trying arms to learn their payoffs against repeatedly playing the arm that currently looks best.

(Image credit: Microsoft Research)
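As a concrete illustration of the exploration/exploitation trade-off described above, here is a minimal epsilon-greedy sketch on a Bernoulli bandit. The function name, arm means, and parameters are illustrative, not from any paper listed below: with probability epsilon the agent explores a random arm, otherwise it exploits the arm with the best empirical mean.

```python
import random

def epsilon_greedy_bandit(true_means, steps=10_000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    `true_means` are the arms' reward probabilities, unknown to the agent.
    Returns per-arm pull counts, per-arm empirical means, and total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms    # number of times each arm was pulled
    values = [0.0] * n_arms  # empirical mean reward of each arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pull a uniformly random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm with the best empirical mean so far.
            arm = max(range(n_arms), key=lambda a: values[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return counts, values, total_reward

counts, values, total = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

With enough steps, the agent concentrates most of its pulls on the best arm (mean 0.8 here) while the epsilon fraction of exploratory pulls keeps its estimates of the other arms from going stale.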

Papers

Showing 281-290 of 1,262 papers

Title | Status | Hype
A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health | - | 0
Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits | - | 0
Incentivized Exploration via Filtered Posterior Sampling | - | 0
Efficient Prompt Optimization Through the Lens of Best Arm Identification | - | 0
Diffusion Models Meet Contextual Bandits with Large Action Spaces | - | 0
Thompson Sampling in Partially Observable Contextual Bandits | - | 0
Thresholding Data Shapley for Data Cleansing Using Multi-Armed Bandits | - | 0
FLASH: Federated Learning Across Simultaneous Heterogeneities | - | 0
Contextual Multinomial Logit Bandits with General Value Functions | - | 0
Efficient Contextual Bandits with Uninformed Feedback Graphs | - | 0
Page 29 of 127

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | - | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | - | Unverified
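The metric in the table above, cumulative regret, is the standard bandit benchmark quantity: the summed gap between the best arm's expected reward and the expected reward of each arm actually pulled. A minimal sketch, assuming known arm means for illustration (the function name and example values are hypothetical, not the benchmark's own computation):

```python
def cumulative_regret(true_means, arms_pulled):
    """Cumulative (pseudo-)regret: for each round, the gap between the
    best arm's mean reward and the pulled arm's mean reward, summed."""
    best = max(true_means)
    return sum(best - true_means[arm] for arm in arms_pulled)

# Arms with means [0.2, 0.5, 0.8]; the best arm is index 2 (mean 0.8).
# Pulling arms [2, 0, 2, 1] incurs gaps 0.0 + 0.6 + 0.0 + 0.3.
regret = cumulative_regret([0.2, 0.5, 0.8], [2, 0, 2, 1])
```

Lower is better, which is why the table ranks the 1.82 entry below the 1.92 one would be preferable once verified.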