T-SCEND: Test-time Scalable MCTS-enhanced Diffusion Model
Tao Zhang, Jia-Shu Pan, Ruiqi Feng, Tailin Wu
Abstract
We introduce the Test-time Scalable MCTS-enhanced Diffusion Model (T-SCEND), a novel framework that significantly improves diffusion models' reasoning capabilities through better energy-based training and scaled-up test-time computation. We first show that naïvely scaling up the inference budget of diffusion models yields only marginal gains. To address this, T-SCEND's training combines a novel linear-regression negative contrastive learning objective, which improves the performance–energy consistency of the energy landscape, with a KL regularization that reduces adversarial sampling. During inference, T-SCEND integrates the denoising process with a novel hybrid Monte Carlo Tree Search (hMCTS), which sequentially performs best-of-N random search and then MCTS as denoising proceeds. On the challenging reasoning tasks of Maze and Sudoku, we demonstrate the effectiveness of T-SCEND's training objective and scalable inference method. In particular, trained on Maze sizes of up to 6×6, T-SCEND solves 88% of Maze problems at the much larger size of 15×15, where standard diffusion fails completely. Code to reproduce the experiments can be found at https://github.com/AI4Science-WestlakeU/t_scend.
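To illustrate the two-phase inference the abstract describes, here is a minimal toy sketch of hMCTS-style denoising: best-of-N random search in the early denoising steps, then a tree-search phase (simplified here to a one-step lookahead with backed-up child energies, standing in for full MCTS selection/expansion/backup) in the later steps. The `denoise_step` and `energy` functions, the step counts, and the switch point are all hypothetical placeholders, not the paper's actual model or hyperparameters.

```python
import random

# Hypothetical stand-ins: a real implementation would use the trained
# diffusion model's denoiser and the learned energy function.
def denoise_step(x, t):
    """One toy reverse-diffusion step: contract x toward 0 with small noise."""
    return x * 0.9 + random.gauss(0, 0.1)

def energy(x):
    """Lower energy = better candidate (toy target: x = 0)."""
    return abs(x)

def hybrid_mcts_denoise(x0, total_steps=10, switch_step=5, n_candidates=4):
    """Sketch of hMCTS inference: best-of-N early, tree lookahead late."""
    x = x0
    for t in range(total_steps, 0, -1):
        # Propose N candidate denoising steps from the current state.
        candidates = [denoise_step(x, t) for _ in range(n_candidates)]
        if t > switch_step:
            # Phase 1: best-of-N random search on the energy.
            x = min(candidates, key=energy)
        else:
            # Phase 2 (simplified MCTS): expand each candidate one step
            # deeper and back up the best child's energy.
            def lookahead(c):
                children = [denoise_step(c, t - 1) for _ in range(n_candidates)]
                return min(energy(ch) for ch in children)
            x = min(candidates, key=lookahead)
    return x

random.seed(0)
result = hybrid_mcts_denoise(x0=5.0)
print(energy(result) < energy(5.0))  # search drives the energy down
```

The switch from cheap random search to tree search mirrors the intuition that early denoising steps are too noisy for deep lookahead to pay off, while later steps benefit from evaluating candidates a step ahead under the energy function.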