Dynamic pricing and assortment under a contextual MNL demand

2021-10-19Unverified0· sign in to hype

Vineet Goyal, Noemie Perivier

Unverified — Be the first to reproduce this paper.

Abstract

We consider dynamic multi-product pricing and assortment problems under an unknown demand over T periods, where in each period, the seller decides on the price for each product or the assortment of products to offer to a customer who chooses according to an unknown Multinomial Logit Model (MNL). Such problems arise in many applications, including online retail and advertising. We propose a randomized dynamic pricing policy based on a variant of the Online Newton Step algorithm (ONS) that achieves a O(dT(T)) regret guarantee under an adversarial arrival model. We also present a new optimistic algorithm for the adversarial MNL contextual bandits problem, which achieves a better dependency than the state-of-the-art algorithms in a problem-dependent constant _2 (potentially exponentially small). Our regret upper bound scales as O(d_2 T+ (T)/_2), which gives a stronger bound than the existing O(dT/_2) guarantees.

Tasks

Multi-Armed Bandits

Dynamic pricing and assortment under a contextual MNL demand

Abstract

Tasks

Reproductions