Contextual bandits with surrogate losses: Margin bounds and efficient algorithms

2018-06-28 · NeurIPS 2018

Dylan J. Foster, Akshay Krishnamurthy


Abstract

We use surrogate losses to obtain several new regret bounds and new algorithms for contextual bandit learning. Using the ramp loss, we derive new margin-based regret bounds in terms of standard sequential complexity measures of a benchmark class of real-valued regression functions. Using the hinge loss, we derive an efficient algorithm with a $\sqrt{dT}$-type mistake bound against benchmark policies induced by $d$-dimensional regressors. Under realizability assumptions, our results also yield classical regret bounds.
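For readers unfamiliar with the two surrogates named in the abstract, the following is a minimal sketch of one common margin-parameterized form of the ramp and hinge losses. The function names, the margin parameter `gamma`, and the sign convention (a positive score `s` counts as a mistake) are assumptions made for illustration and may differ from the definitions used in the paper.

```python
def ramp_loss(s: float, gamma: float) -> float:
    """Ramp surrogate (assumed form): 0 for s <= -gamma, 1 for s >= 0, linear in between."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return min(max(1.0 + s / gamma, 0.0), 1.0)


def hinge_loss(s: float, gamma: float) -> float:
    """Hinge surrogate (assumed form): the ramp loss without the cap at 1, hence convex in s."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return max(1.0 + s / gamma, 0.0)


def zero_one(s: float) -> float:
    """Zero-one loss under the assumed convention: a mistake whenever s > 0."""
    return 1.0 if s > 0 else 0.0
```

Under these assumed definitions, both surrogates upper-bound the zero-one loss, and the hinge loss upper-bounds the ramp loss. The ramp loss is bounded but non-convex, while the hinge loss is convex, which is typically what makes it amenable to efficient optimization.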

Tasks

Multi-Armed Bandits, regression
