Order-Explicit Linearization of High-Dimensional $U$-Statistics

  • 2026-05-13 17:13:07
  • David M. Ritzwoller, Vasilis Syrgkanis

Abstract

We give an order-explicit large deviation bound for the difference between a high-dimensional $U$-statistic and its Hájek projection. In particular, we show that any $U$-statistic of order $b$ on $n$ observations, with a $d$-dimensional kernel whose coordinates have $\psi_1$-Orlicz norm at most $\phi$, has a maximum deviation from its Hájek projection of order $O_p(\phi b n^{-1} \log^2(dn))$. The proof relies on the development of novel order-explicit moment inequalities for higher-order Hoeffding components. We show that this rate is unimprovable, up to the polynomial factor on the logarithmic term. As corollaries, we obtain new Bernstein-type concentration and Gaussian approximation results for high-dimensional $U$-statistics. We apply these results to establish the consistency of a set of resampling-based simultaneous confidence intervals built around a class of nonparametric regression estimators constructed with subsampled kernels. This class encompasses several forms of random forest regression, including Generalized Random Forests.
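To make the objects in the abstract concrete, the following is a minimal numerical sketch, not taken from the paper. It assumes a toy setting chosen purely for illustration: order $b = 2$, i.i.d. standard normal data in $d$ coordinates, and the coordinate-wise kernel $h(x, y) = (x - y)^2 / 2$, whose $U$-statistic is the unbiased sample variance and whose coordinates are sub-exponential. Under these assumptions the first-order projection is $h_1(x) = (x^2 - 1)/2$, so the Hájek projection is available in closed form. The function names `pairwise_u_statistic` and `hajek_projection` are hypothetical.

```python
import numpy as np


def pairwise_u_statistic(X):
    """Order-2 U-statistic with the assumed kernel h(x, y) = (x - y)**2 / 2.

    X has shape (n, d); the kernel is applied coordinate-wise and averaged
    over all unordered pairs i < j, giving a d-dimensional statistic.
    """
    n = X.shape[0]
    i, j = np.triu_indices(n, k=1)  # indices of all unordered pairs i < j
    return ((X[i] - X[j]) ** 2 / 2).mean(axis=0)


def hajek_projection(X, mu=0.0, sigma2=1.0):
    """Hajek projection theta + (b / n) * sum_i h_1(X_i) with b = 2.

    For the assumed kernel and data with known mean mu and variance sigma2,
    theta = sigma2 and h_1(x) = ((x - mu)**2 - sigma2) / 2.
    """
    h1 = ((X - mu) ** 2 - sigma2) / 2
    return sigma2 + 2 * h1.mean(axis=0)


rng = np.random.default_rng(0)
n, d = 200, 50  # illustrative sample size and kernel dimension
X = rng.standard_normal((n, d))

u_stat = pairwise_u_statistic(X)
hajek = hajek_projection(X)

# Maximum over the d coordinates of |U-statistic - Hajek projection|,
# the quantity that the abstract's bound controls.
max_dev = np.max(np.abs(u_stat - hajek))
print(f"max deviation from Hajek projection: {max_dev:.4f}")
```

In this toy case the deviation equals the second-order Hoeffding component, so rerunning with larger `n` should show it shrinking roughly like $1/n$, while the fluctuation of the $U$-statistic itself around its mean shrinks only like $n^{-1/2}$.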

 
