Continuous Assortment Optimization with Logit Choice Probabilities under Incomplete Information

Abstract

We consider assortment optimization for a product for which a particular attribute can be continuously adjusted. Examples include the duration of a loan (where each duration corresponds to a specific interest rate) and the data limit for a cell phone subscription. The question to be addressed is: how should a retailer determine what to offer to maximize profit? Representing the assortment as a union of subintervals, we model the choice of a customer with a continuous logit choice model; a capacity constraint is imposed on the assortment. The problem can be phrased as a multi-armed bandit, i.e., the objective is to estimate demand over time by sequentially offering different assortments to incoming customers. Kernel density estimation is applied to the observed purchases. We present an explore-then-exploit policy, which incurs a regret of at most order $T^{2/3}$ (neglecting logarithmic factors). Also, by showing that any policy must in the worst case incur a regret of at least order $T^{2/3}$, we conclude that our policy can be regarded as asymptotically optimal.
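
To make the setting concrete, one common way to write a continuous logit choice model with a capacity constraint is sketched here. The notation is an assumption for illustration and is not taken from the abstract: the attribute range is normalised to $[0,1]$, $v$ is a positive preference function, $w$ is the profit per purchase, $S$ is the offered assortment, and $c$ is the capacity.

```latex
% Assumed notation (hypothetical, for illustration only):
%   v(x) > 0 : preference weight of attribute value x
%   w(x)     : profit when the customer buys x
%   S        : assortment, a union of subintervals of [0,1]
%   c        : capacity, an upper bound on the total length of S
\[
  \mathbb{P}(\text{purchase in } A \mid S)
    = \frac{\int_{A} v(x)\,dx}{1 + \int_{S} v(y)\,dy},
  \qquad A \subseteq S,
\]
\[
  \mathbb{P}(\text{no purchase} \mid S)
    = \frac{1}{1 + \int_{S} v(y)\,dy},
\]
\[
  \max_{S \subseteq [0,1],\ \operatorname{vol}(S) \le c}\;
  r(S) = \frac{\int_{S} v(x)\,w(x)\,dx}{1 + \int_{S} v(y)\,dy}.
\]
```

Under these assumptions, $v$ is unknown to the retailer, which is why the purchase density has to be estimated from observed purchases before a near-optimal $S$ can be chosen.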
