Abstract
We propose kernel PCA as a method for analyzing the dependence structure of multivariate extremes and demonstrate that it can be a powerful tool for clustering and dimension reduction. Our work provides some theoretical insight into the preimages obtained by kernel PCA, demonstrating that under certain conditions they can effectively identify clusters in the data. We build on these new insights to characterize rigorously the performance of kernel PCA based on an extremal sample, i.e., the angular part of random vectors for which the radius exceeds a large threshold. More specifically, we focus on the asymptotic dependence of multivariate extremes characterized by the angular or spectral measure in extreme value theory and provide a careful analysis in the case where the extremes are generated from a linear factor model. We give theoretical guarantees on the performance of kernel PCA preimages of such extremes by leveraging their asymptotic distribution together with Davis-Kahan perturbation bounds. Our theoretical findings are complemented with numerical experiments illustrating the finite sample performance of our methods.
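To make the notion of an extremal sample concrete, the following is a minimal sketch of how the angular parts of the largest observations might be extracted before any kernel PCA step. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `extremal_angles`, the Euclidean norm, the choice of threshold as the k-th largest radius, and the toy loading matrix `A` are all hypothetical.

```python
import math
import random


def extremal_angles(sample, k):
    """Return the angular parts x / ||x|| of the k points with the largest radius.

    Assumption: the radius is the Euclidean norm and the threshold is set
    implicitly by keeping the k largest radii.
    """
    if not 0 < k <= len(sample):
        raise ValueError("k must be between 1 and the sample size")
    # Pair each point with its radius, dropping points at the origin.
    with_radius = [(math.sqrt(sum(c * c for c in x)), x) for x in sample]
    with_radius = [(r, x) for r, x in with_radius if r > 0.0]
    # Sort by radius only, largest first, and keep the k most extreme points.
    with_radius.sort(key=lambda pair: pair[0], reverse=True)
    top = with_radius[:k]
    # Project each retained point onto the unit sphere.
    return [[c / r for c in x] for r, x in top]


# Toy data from a linear factor model X = A Z with heavy-tailed factors.
# The 3 x 2 loading matrix A and the tail index 2.0 are illustrative choices.
random.seed(0)
A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]


def draw():
    z = [random.paretovariate(2.0) for _ in range(2)]
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]


sample = [draw() for _ in range(2000)]
angles = extremal_angles(sample, k=100)
```

The resulting list `angles` is the kind of input on which a kernel PCA would then be run; that step is omitted here because it depends on the kernel and the numerical library chosen.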