Abstract
Graph Neural Networks (GNNs) excel at many graph machine learning tasks but face challenges when scaling to large networks. GNN transferability allows training on smaller graphs and applying the model to larger ones, but existing methods often rely on random subsampling, which leads to disconnected subgraphs and reduced model expressivity. We propose a novel graph sampling algorithm that leverages feature homophily to preserve graph structure. By minimizing the trace of the data correlation matrix, our method better preserves the graph Laplacian trace (a proxy for graph connectivity) than random sampling, while achieving lower complexity than spectral methods. Experiments on citation networks show that, compared to random sampling, our method better preserves the Laplacian trace and improves GNN transferability.