SOTAVerified

Multi-Armed Bandits

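Multi-armed bandits refer to a class of problems in which a fixed, limited set of resources must be allocated among competing choices (the "arms") so as to maximize expected gain. The payoff of each arm is only partially known at the start and is learned by trying it, so these problems typically involve an exploration/exploitation trade-off.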
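
As an illustration of the exploration/exploitation trade-off, here is a minimal sketch of an epsilon-greedy strategy on a Bernoulli bandit. It does not come from any of the papers listed on this page. The function name `epsilon_greedy`, the arm means `[0.2, 0.5, 0.8]`, and the parameters `steps`, `epsilon`, and `seed` are assumptions chosen only for the example. The sketch also tracks cumulative pseudo-regret, the summed gap between the best arm's mean and the mean of the arm actually pulled.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit.

    true_means are hypothetical success probabilities, hidden from the learner
    and used here only to simulate rewards and to measure regret.
    Returns (pull counts, value estimates, cumulative pseudo-regret).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulate a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        # Pseudo-regret: expected loss relative to always pulling the best arm.
        regret += best_mean - true_means[arm]

    return counts, estimates, regret


counts, estimates, regret = epsilon_greedy([0.2, 0.5, 0.8])
print("pulls per arm:", counts)
print("estimated means:", [round(e, 3) for e in estimates])
print("cumulative pseudo-regret:", round(regret, 1))
```

A larger `epsilon` spends more pulls on exploration, which improves the estimates but adds regret; a smaller one exploits sooner but risks locking onto a suboptimal arm.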


Papers

Showing 176-200 of 1262 papers

Title | Status | Hype
Bayesian Design Principles for Frequentist Sequential Learning | Code | 0
Bayesian Optimisation over Multiple Continuous and Categorical Inputs | Code | 0
Kernel Conditional Moment Constraints for Confounding Robust Inference | Code | 0
Scalable Exploration via Ensemble++ | Code | 0
Causally Abstracted Multi-armed Bandits | Code | 0
Learning Contextual Bandits in a Non-stationary Environment | Code | 0
Learning Structural Weight Uncertainty for Sequential Decision-Making | Code | 0
Locally Differentially Private (Contextual) Bandits Learning | Code | 0
Locally Private Nonparametric Contextual Multi-armed Bandits | Code | 0
Budgeted Multi-Armed Bandits with Asymmetric Confidence Intervals | Code | 0
Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery | Code | 0
Adaptive Linear Estimating Equations | Code | 0
Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints | Code | 0
Best Arm Identification with Fixed Budget: A Large Deviation Perspective | Code | 0
Meta-in-context learning in large language models | Code | 0
Empirical analysis of representation learning and exploration in neural kernel bandits | Code | 0
Multi-agent Multi-armed Bandits with Minimum Reward Guarantee Fairness | Code | 0
Distribution oblivious, risk-aware algorithms for multi-armed bandits with unbounded rewards | Code | 0
Multi-Armed Bandits in Brain-Computer Interfaces | Code | 0
Bandit-Based Monte Carlo Optimization for Nearest Neighbors | Code | 0
Multi-Armed Bandits with Network Interference | Code | 0
An Experimental Design for Anytime-Valid Causal Inference on Multi-Armed Bandits | Code | 0
Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming | Code | 0
Model selection for contextual bandits | Code | 0
Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NeuralLinear FullPosterior-MR | Cumulative regret | 1.92 | - | Unverified
2 | Linear FullPosterior-MR | Cumulative regret | 1.82 | - | Unverified