Statistical Optimal Transport via Factored Couplings

Abstract

We propose a new method to estimate Wasserstein distances and optimal transport plans between two probability distributions from samples in high dimension. Unlike plug-in rules that simply replace the true distributions by their empirical counterparts, our method promotes couplings with low transport rank, a new structural assumption that is similar to the nonnegative rank of a matrix. Regularizing based on this assumption leads to drastic improvements on high-dimensional data for various tasks, including domain adaptation in single-cell RNA sequencing data. These findings are supported by a theoretical analysis that indicates that the transport rank is key in overcoming the curse of dimensionality inherent to data-driven optimal transport.
