The graph convolutional networks (GCN) recently proposed by Kipf and Wellingare an effective graph model for semi-supervised learning. This model, however,was originally designed to be learned with the presence of both training andtest data. Moreover, the recursive neighborhood expansion across layers posestime and memory challenges for training with large, dense graphs. To relax therequirement of simultaneous availability of test data, we interpret graphconvolutions as integral transforms of embedding functions under probabilitymeasures. Such an interpretation allows for the use of Monte Carlo approachesto consistently estimate the integrals, which in turn leads to a batchedtraining scheme as we propose in this work---FastGCN. Enhanced with importancesampling, FastGCN not only is efficient for training but also generalizes wellfor inference. We show a comprehensive set of experiments to demonstrate itseffectiveness compared with GCN and related models. In particular, training isorders of magnitude more efficient while predictions remain comparablyaccurate.