T-SCEND: Test-time Scalable MCTS-enhanced Diffusion Model

Abstract

We introduce Test-time Scalable MCTS-enhanced Diffusion Model (T-SCEND), a novel framework that significantly improves the reasoning capabilities of diffusion models through better energy-based training and scaled-up test-time computation. We first show that naively scaling up the inference budget for diffusion models yields only marginal gains. To address this, the training of T-SCEND consists of a novel linear-regression negative contrastive learning objective, which improves the performance-energy consistency of the energy landscape, and a KL regularization, which reduces adversarial sampling. During inference, T-SCEND integrates the denoising process with a novel hybrid Monte Carlo Tree Search (hMCTS), which sequentially performs best-of-N random search and MCTS as denoising proceeds. On the challenging reasoning tasks of Maze and Sudoku, we demonstrate the effectiveness of T-SCEND's training objective and scalable inference method. In particular, trained on Maze sizes of up to $6\times6$, our T-SCEND solves $88\%$ of Maze problems with much larger sizes of $15\times15$, while standard diffusion completely fails. Code to reproduce the experiments can be found at https://github.com/AI4Science-WestlakeU/t_scend.
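To make the best-of-N component of the inference procedure concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes that best-of-N means running N independent denoising trajectories and keeping the sample with the lowest energy. The names `energy`, `denoise`, and `best_of_n` are hypothetical, the energy here is a hand-written squared distance to a known target rather than a learned model, and the MCTS phase of hMCTS is not shown.

```python
import random


def energy(x, target):
    """Toy stand-in for a learned energy: squared distance to a known target."""
    return sum((a - b) ** 2 for a, b in zip(x, target))


def denoise(x, target, steps, rng, noise_scale=0.3):
    """Toy denoiser (assumption): step halfway to the target, with shrinking noise."""
    for t in range(steps, 0, -1):
        sigma = noise_scale * t / steps  # noise level decays as t approaches 0
        x = [a + 0.5 * (b - a) + rng.gauss(0.0, sigma) for a, b in zip(x, target)]
    return x


def best_of_n(target, n, steps=10, seed=0):
    """Run n independent denoising trajectories and return the lowest-energy sample."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n):
        x0 = [rng.gauss(0.0, 1.0) for _ in target]  # fresh Gaussian initial noise
        candidates.append(denoise(x0, target, steps, rng))
    return min(candidates, key=lambda x: energy(x, target))
```

In this toy setting, increasing `n` can only lower (or leave unchanged) the energy of the selected sample for a fixed seed, because the candidates for a smaller `n` are a prefix of those for a larger `n`. Whether a lower energy corresponds to a better solution depends on the energy landscape, which is the consistency property the training objective described above is meant to improve.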
