Variance reduction combining pre-experiment and in-experiment data

Abstract

Online controlled experiments (A/B testing) are essential in data-driven decision-making for many companies. Increasing the sensitivity of these experiments, particularly with a fixed sample size, relies on reducing the variance of the estimator for the average treatment effect (ATE). Existing methods like CUPED and CUPAC use pre-experiment data to reduce variance, but their effectiveness depends on the correlation between the pre-experiment data and the outcome. In contrast, in-experiment data is often more strongly correlated with the outcome and thus more informative. In this paper, we introduce a novel method that combines both pre-experiment and in-experiment data to achieve greater variance reduction than CUPED and CUPAC, without introducing bias or additional computational complexity. We also establish asymptotic theory and provide consistent variance estimators for our method. Applying this method to multiple online experiments at Etsy, we achieve substantial variance reduction over CUPAC with the inclusion of only a few in-experiment covariates. These results highlight the potential of our approach to significantly improve experiment sensitivity and accelerate decision-making.
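
To make the dependence on correlation concrete, the following is a minimal sketch of the standard CUPED adjustment with a single pre-experiment covariate. It is not the combined method introduced in this paper. The data are synthetic, and the function names, effect size, and noise scales are assumptions chosen only for illustration. The adjusted outcome is y - theta * (x - mean(x)) with theta = cov(y, x) / var(x), which shrinks the outcome variance by a factor of 1 - rho^2, where rho is the correlation between covariate and outcome.

```python
import numpy as np

def cuped_adjust(y, x):
    """Standard CUPED adjustment: y - theta * (x - mean(x))."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # theta minimizes the variance of the adjusted outcome
    theta = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())

def simulate(n=20000, effect=0.1, seed=0):
    """Synthetic experiment (assumed setup, not data from the paper)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                # pre-experiment covariate
    treat = rng.integers(0, 2, size=n)    # random 50/50 assignment
    y = 0.8 * x + rng.normal(scale=0.6, size=n) + effect * treat
    return x, treat, y

x, treat, y = simulate()
y_adj = cuped_adjust(y, x)

# Difference-in-means ATE estimates before and after adjustment
ate_raw = y[treat == 1].mean() - y[treat == 0].mean()
ate_cuped = y_adj[treat == 1].mean() - y_adj[treat == 0].mean()

# Ratio of adjusted to raw outcome variance, and squared correlation
var_ratio = y_adj.var(ddof=1) / y.var(ddof=1)
rho2 = np.corrcoef(y, x)[0, 1] ** 2

print(f"raw ATE estimate:   {ate_raw:.4f}")
print(f"CUPED ATE estimate: {ate_cuped:.4f}")
print(f"variance ratio:     {var_ratio:.4f}  (1 - rho^2 = {1 - rho2:.4f})")
```

Because the covariate here is measured before assignment, it is independent of treatment and the adjustment leaves the ATE estimate unbiased. A covariate measured during the experiment may itself be affected by treatment, so it cannot simply be substituted for x in this sketch.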
