Gaussian process regression networks (GPRN) are powerful Bayesian models formulti-output regression, but their inference is intractable. To address thisissue, existing methods use a fully factorized structure (or a mixture of suchstructures) over all the outputs and latent functions for posteriorapproximation, which, however, can miss the strong posterior dependencies amongthe latent variables and hurt the inference quality. In addition, the updatesof the variational parameters are inefficient and can be prohibitivelyexpensive for a large number of outputs. To overcome these limitations, wepropose a scalable variational inference algorithm for GPRN, which not onlycaptures the abundant posterior dependencies but also is much more efficientfor massive outputs. We tensorize the output space and introducetensor/matrix-normal variational posteriors to capture the posteriorcorrelations and to reduce the parameters. We jointly optimize all theparameters and exploit the inherent Kronecker product structure in thevariational model evidence lower bound to accelerate the computation. Wedemonstrate the advantages of our method in several real-world applications.