Design-unbiased statistical learning in survey sampling

Abstract

Design-consistent model-assisted estimation has become the standard practice in survey sampling. However, a general theory that allows one to incorporate modern machine-learning techniques, which can lead to potentially much more powerful assisting models, is still lacking. We propose a subsampling Rao-Blackwell method and develop a statistical learning theory for exactly design-unbiased estimation with the help of linear or non-linear prediction models. Our approach makes use of classic ideas from Statistical Science as well as the rapidly growing field of Machine Learning. Given rich auxiliary information, it can yield considerable efficiency gains over standard linear model-assisted methods, while ensuring valid estimation for the given target population that is robust against potential misspecification of the assisting model at the individual level.
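
The abstract does not spell out the estimator, so the following is only an illustrative sketch of how a subsampling Rao-Blackwell scheme of this kind might look, not the authors' implementation. It assumes simple random sampling without replacement, auxiliary variables known for every population unit, and an ordinary least-squares assisting model; the function name `rb_total` and all of its parameters are hypothetical. For each random split of the sample into a training part and a test part, the model is fitted on the training part, predictions are summed over all units outside it, and the test residuals are expanded to correct the predictions. Conditional on the training part, the test part is a simple random sample of the remaining units, so each split estimate is design-unbiased whatever the model; averaging over splits approximates the Rao-Blackwell step by Monte Carlo.

```python
import numpy as np

def rb_total(X, y_sample, sample, n_train, n_splits=200, seed=0):
    """Monte Carlo sketch of a subsampling Rao-Blackwell estimate of a total.

    X        : (N, p) auxiliary matrix, assumed known for every population unit
    y_sample : (n,) outcomes observed on the sampled units only
    sample   : (n,) population indices of the sample
               (assumed drawn by simple random sampling without replacement)
    n_train  : number of sampled units used to fit the assisting model
    """
    rng = np.random.default_rng(seed)
    N, n = X.shape[0], len(sample)
    n_test = n - n_train
    Xd = np.column_stack([np.ones(N), X])  # design matrix with intercept

    estimates = []
    for _ in range(n_splits):
        # Randomly split the sample into training and test parts.
        perm = rng.permutation(n)
        tr, te = perm[:n_train], perm[n_train:]

        # Fit the assisting model on the training part only (OLS here;
        # any learner trained on this part alone could be swapped in).
        beta, *_ = np.linalg.lstsq(Xd[sample[tr]], y_sample[tr], rcond=None)
        pred = Xd @ beta

        # Units outside the training part: predicted, then bias-corrected.
        outside = np.ones(N, dtype=bool)
        outside[sample[tr]] = False
        resid = y_sample[te] - pred[sample[te]]

        est = (
            y_sample[tr].sum()                       # observed training outcomes
            + pred[outside].sum()                    # predictions for the rest
            + (N - n_train) / n_test * resid.sum()   # expanded test residuals
        )
        estimates.append(est)

    # Averaging over splits approximates the conditional expectation
    # given the full sample (the Rao-Blackwell step).
    return float(np.mean(estimates))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    N = 1000
    X = rng.normal(size=(N, 2))
    y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=N)
    sample = rng.choice(N, size=100, replace=False)
    print("estimate:", rb_total(X, y[sample], sample, n_train=70))
    print("true total:", y.sum())
```

Because the residual correction uses only units that played no part in fitting the model, a more flexible learner can replace the least-squares fit without affecting unbiasedness under these assumptions; only the variance of the estimate would change.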
