Efficient Linear Bandits through Matrix Sketching

2018-09-28

Ilja Kuzborskij, Leonardo Cella, Nicolò Cesa-Bianchi

Abstract

We prove that two popular linear contextual bandit algorithms, OFUL and Thompson Sampling, can be made efficient using Frequent Directions, a deterministic online sketching technique. More precisely, we show that a sketch of size $m$ allows an $O(md)$ update time for both algorithms, as opposed to the $\Omega(d^2)$ required by their non-sketched versions in general (where $d$ is the dimension of the context vectors). This computational speedup is accompanied by regret bounds of order $(1+\varepsilon_m)^{3/2} d\sqrt{T}$ for OFUL and of order $\big((1+\varepsilon_m)d\big)^{3/2}\sqrt{T}$ for Thompson Sampling, where $\varepsilon_m$ is bounded by the sum of the tail eigenvalues not covered by the sketch. In particular, when the selected contexts span a subspace of dimension at most $m$, our algorithms have a regret bound matching that of their slower, non-sketched counterparts. Experiments on real-world datasets corroborate our theoretical results.
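To make the sketching step concrete, here is a minimal illustration of a Frequent Directions update in Python with NumPy. It is a simplified textbook variant, not the paper's implementation: the function name `frequent_directions`, the requirement `m <= d`, and the choice to recompute a full SVD on every row are assumptions made for readability. In particular, this naive version costs more than $O(md)$ per update; the faster update time quoted in the abstract relies on a more careful scheme that is not reproduced here.

```python
import numpy as np

def frequent_directions(rows, m):
    """Sketch the rows of A (n x d) into S (m x d) so that S.T @ S ~ A.T @ A.

    Simplified variant for illustration only (assumes m <= d); the name and
    signature are hypothetical, not taken from the paper.
    """
    rows = np.asarray(rows, dtype=float)
    d = rows.shape[1]
    assert m <= d, "this sketch assumes the sketch size does not exceed d"
    S = np.zeros((m, d))
    for x in rows:
        # The last row of S is zero at the start of every iteration,
        # so the new context vector can be written there.
        S[-1] = x
        # Thin SVD of the m x d sketch; singular values come back sorted
        # in decreasing order.
        _, sigma, Vt = np.linalg.svd(S, full_matrices=False)
        # Shrink every squared singular value by the smallest one.
        # This zeroes out the last direction and frees the last row again.
        rho = sigma[-1] ** 2
        shrunk = np.sqrt(np.maximum(sigma ** 2 - rho, 0.0))
        S = shrunk[:, None] * Vt
    return S
```

When the inserted rows span fewer than `m` directions, the smallest singular value is zero at every step, nothing is shrunk, and `S.T @ S` reproduces `A.T @ A` up to floating-point error. This loosely mirrors the low-rank case mentioned in the abstract. For general inputs, the spectral-norm error of this variant is at most the squared Frobenius norm of `A` divided by `m`.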

Tasks

Thompson Sampling
