SOTAVerified

Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits

2024-10-02

Jiabin Lin, Shana Moothedath, Namrata Vaswani


Abstract

We study how representation learning can improve the learning efficiency of contextual bandit problems. We consider the setting where T linear contextual bandits of dimension d are played simultaneously, and these T tasks share a common linear representation of dimension r, with r much smaller than d. We present a new estimator, based on alternating projected gradient descent (GD) and minimization, that recovers the low-rank feature matrix. Using this estimator, we develop a multi-task learning algorithm for linear contextual bandits and prove a regret bound for it. We also present experiments comparing the performance of our algorithm against benchmark algorithms.
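To make the estimator in the abstract more concrete, here is a minimal NumPy sketch of alternating projected GD and minimization for a shared low-rank representation. It is an illustration under stated assumptions, not the authors' algorithm: it assumes noiseless linear rewards y = x^T B w_t with Gaussian contexts, uses a plain (untruncated) spectral initialization and a heuristic step size, and leaves out the bandit arm selection and regret analysis entirely. The names `make_problem`, `alt_gd_min`, and `subspace_distance` are hypothetical.

```python
import numpy as np

def make_problem(T, n, d, r, seed=0):
    """Synthetic data (assumption): T tasks, n samples each, theta_t = B_star @ w_t."""
    rng = np.random.default_rng(seed)
    B_star, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal d x r
    W_star = rng.standard_normal((r, T))
    X = rng.standard_normal((T, n, d))
    y = np.einsum('tnd,dt->tn', X, B_star @ W_star)         # noiseless rewards
    return X, y, B_star

def subspace_distance(B, B_star):
    """Spectral norm of the part of B lying outside span(B_star)."""
    return np.linalg.norm(B - B_star @ (B_star.T @ B), 2)

def solve_weights(X, y, B):
    """Minimization step: per-task least squares for w_t with B held fixed."""
    T, r = X.shape[0], B.shape[1]
    W = np.zeros((r, T))
    for t in range(T):
        W[:, t] = np.linalg.lstsq(X[t] @ B, y[t], rcond=None)[0]
    return W

def alt_gd_min(X, y, r, n_iters=100, c=0.5):
    """Alternate least squares over W with a projected GD step over B."""
    T, n, d = X.shape
    # Simplified spectral initialization: top-r left singular vectors
    # of the d x T matrix whose columns are (1/n) X_t^T y_t.
    Theta0 = np.einsum('tnd,tn->dt', X, y) / n
    U, _, _ = np.linalg.svd(Theta0, full_matrices=False)
    B = U[:, :r]
    for _ in range(n_iters):
        W = solve_weights(X, y, B)
        # Gradient of 0.5 * sum_t ||X_t B w_t - y_t||^2 with respect to B.
        resid = np.einsum('tnd,dt->tn', X, B @ W) - y
        grad = np.einsum('tnd,tn,rt->dr', X, resid, W)
        # Heuristic step size (assumption): c / (n * sigma_max(W)^2).
        step = c / (n * np.linalg.norm(W, 2) ** 2)
        # Projection step: QR re-orthonormalizes the columns of B.
        B, _ = np.linalg.qr(B - step * grad)
    return B, solve_weights(X, y, B)

# Usage: n < d, so no single task can be solved on its own,
# but the T tasks jointly identify the shared r-dimensional subspace.
X, y, B_star = make_problem(T=200, n=25, d=30, r=2)
B_hat, W_hat = alt_gd_min(X, y, r=2)
print("subspace distance:", subspace_distance(B_hat, B_star))
```

The sketch keeps B orthonormal by taking a QR decomposition after every gradient step, and it updates each task's weight vector by an r-dimensional least-squares solve, which is cheap because r is much smaller than d.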
