Abstract
This paper studies the statistical and computational limits of high-orderclustering with planted structures. We focus on two clustering models, constanthigh-order clustering (CHC) and rank-one higher-order clustering (ROHC), andstudy the methods and theories for testing whether a cluster exists (detection)and identifying the support of cluster (recovery). Specifically, we identify sharp boundaries of signal-to-noise ratio for whichCHC and ROHC detection/recovery are statistically possible. We also developtight computational thresholds: when the signal-to-noise ratio is below thesethresholds, we prove that polynomial-time algorithms cannot solve theseproblems under the computational hardness conjectures of hypergraphic plantedclique (HPC) detection and hypergraphic planted dense subgraph (HPDS) recovery.We also propose polynomial-time tensor algorithms that achieve reliabledetection and recovery when the signal-to-noise ratio is above thesethresholds. Both sparsity and tensor structures yield the computationalbarriers in high-order tensor clustering. The interplay between them results insignificant differences between high-order tensor clustering and matrixclustering in literature in aspects of statistical and computational phasetransition diagrams, algorithmic approaches, hardness conjecture, and prooftechniques. To our best knowledge, we are the first to give a thoroughcharacterization of the statistical and computational trade-off for such adouble computational-barrier problem. In addition, we also provide evidence forthe computational hardness conjectures of HPC detection and HPDS recovery.