Abstract
The goal in thinning is to summarize a dataset using a small set ofrepresentative points. Remarkably, sub-Gaussian thinning algorithms like KernelHalving and Compress can match the quality of uniform subsampling whilesubstantially reducing the number of summary points. However, existingguarantees cover only a restricted range of distributions and kernel-basedquality measures and suffer from pessimistic dimension dependence. To addressthese deficiencies, we introduce a new low-rank analysis of sub-Gaussianthinning that applies to any distribution and any kernel, guaranteeinghigh-quality compression whenever the kernel or data matrix is approximatelylow-rank. To demonstrate the broad applicability of the techniques, we designpractical sub-Gaussian thinning approaches that improve upon the best knownguarantees for approximating attention in transformers, accelerating stochasticgradient training through reordering, and distinguishing distributions innear-linear time.