Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time

  • 2022-06-24 16:37:39
  • David P. Woodruff, Amir Zandieh
  • 0

Abstract

We propose an input sparsity time sampling algorithm that can spectrallyapproximate the Gram matrix corresponding to the $q$-fold column-wise tensorproduct of $q$ matrices using a nearly optimal number of samples, improvingupon all previously known methods by poly$(q)$ factors. Furthermore, for theimportant special case of the $q$-fold self-tensoring of a dataset, which isthe feature matrix of the degree-$q$ polynomial kernel, the leading term of ourmethod's runtime is proportional to the size of the input dataset and has nodependence on $q$. Previous techniques either incur poly$(q)$ slowdowns intheir runtime or remove the dependence on $q$ at the expense of havingsub-optimal target dimension, and depend quadratically on the number ofdata-points in their runtime. Our sampling technique relies on a collection of$q$ partially correlated random projections which can be simultaneously appliedto a dataset $X$ in total time that only depends on the size of $X$, and at thesame time their $q$-fold Kronecker product acts as a near-isometry for anyfixed vector in the column span of $X^{\otimes q}$. We also show that oursampling methods generalize to other classes of kernels beyond polynomial, suchas Gaussian and Neural Tangent kernels.

 

Quick Read (beta)

loading the full paper ...