Denoising score matching with Annealed Langevin Sampling (DSM-ALS) is a recent approach to generative modeling. Despite the convincing visual quality of samples, this method appears to perform worse than Generative Adversarial Networks (GANs) under the Fr\'echet Inception Distance, a popular metric for generative models. We show that this apparent gap vanishes when denoising the final Langevin samples using the score network. In addition, we propose two improvements to DSM-ALS: 1) Consistent Annealed Sampling as a more stable alternative to Annealed Langevin Sampling, and 2) a hybrid training formulation, composed of both denoising score matching and adversarial objectives. By combining both of these techniques and exploring different network architectures, we elevate score matching methods and obtain results competitive with state-of-the-art image generation on CIFAR-10.
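
As an illustration of what denoising the final Langevin samples with the score network might look like, the sketch below runs a standard annealed Langevin loop and then applies one denoising step of the form x + sigma^2 * s(x, sigma) (Tweedie's formula). This is a toy sketch under stated assumptions, not the implementation behind the results above: the names `gaussian_score`, `annealed_langevin`, and `denoise`, the step size `eps`, and the noise schedule are all hypothetical, and an analytic Gaussian score stands in for a trained score network.

```python
import numpy as np


def gaussian_score(x, sigma):
    # Toy stand-in for a trained score network (assumption): for N(0, 1) data
    # perturbed by N(0, sigma^2) noise, the perturbed density is
    # N(0, 1 + sigma^2), whose score is -x / (1 + sigma^2).
    return -x / (1.0 + sigma**2)


def annealed_langevin(score_fn, x, sigmas, eps=2e-2, n_steps=50, rng=None):
    # Annealed Langevin dynamics over decreasing noise levels.
    # The step size is scaled by sigma^2 relative to the smallest noise level.
    rng = np.random.default_rng(0) if rng is None else rng
    for sigma in sigmas:
        alpha = eps * sigma**2 / sigmas[-1] ** 2
        for _ in range(n_steps):
            z = rng.standard_normal(x.shape)
            x = x + 0.5 * alpha * score_fn(x, sigma) + np.sqrt(alpha) * z
    return x


def denoise(score_fn, x, sigma):
    # One denoising step via Tweedie's formula: an estimate of the clean
    # sample given a sample carrying noise of standard deviation sigma.
    return x + sigma**2 * score_fn(x, sigma)


# Hypothetical geometric noise schedule from 1.0 down to 0.1.
sigmas = np.geomspace(1.0, 0.1, num=10)
x_init = np.random.default_rng(1).standard_normal(1000)
samples = annealed_langevin(gaussian_score, x_init, sigmas)
denoised = denoise(gaussian_score, samples, sigmas[-1])
```

In this toy setting the denoising step simply rescales each sample by 1 / (1 + sigma^2), which removes the expected contribution of the residual noise left at the last noise level. With a learned score network the correction would be input-dependent.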