Abstract
We give an order-explicit large deviation bound for the difference between a high-dimensional $U$-statistic and its Hájek projection. In particular, we show that any $U$-statistic of order $b$ on $n$ observations, with a $d$-dimensional kernel whose coordinates have $\psi_1$-Orlicz norm at most $\phi$, has a maximum deviation from its Hájek projection of order $O_p(\phi b n^{-1}\log^2(dn))$. The proof relies on novel order-explicit moment inequalities for higher-order Hoeffding components. We show that this rate cannot be improved, except possibly in the power of the logarithmic factor. As corollaries, we obtain new Bernstein-type concentration and Gaussian approximation results for high-dimensional $U$-statistics. We apply these results to establish the consistency of a set of resampling-based simultaneous confidence intervals built around a class of nonparametric regression estimators constructed with subsampled kernels. This class encompasses several forms of random forest regression, including Generalized Random Forests.
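
For orientation, the main bound can be written out in standard textbook notation. The symbols $h$, $X_i$, $\theta$, $h_1$, $U_n$ and $\hat U_n$ below are assumptions introduced here for illustration, as are the conventions of a symmetric kernel and i.i.d. observations; the paper's own notation and conditions may differ.

$$
U_n = \binom{n}{b}^{-1} \sum_{1 \le i_1 < \dots < i_b \le n} h(X_{i_1}, \dots, X_{i_b}),
\qquad
\hat U_n = \theta + \frac{b}{n} \sum_{i=1}^{n} h_1(X_i),
$$

where $h$ takes values in $\mathbb{R}^d$, $\theta = \mathbb{E}\, h(X_1, \dots, X_b)$, and $h_1(x) = \mathbb{E}\, h(x, X_2, \dots, X_b) - \theta$. In this notation, the stated result reads

$$
\| U_n - \hat U_n \|_\infty = O_p\!\left( \phi\, b\, n^{-1} \log^2(dn) \right),
$$

with $\|\cdot\|_\infty$ denoting the maximum absolute value over the $d$ coordinates.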