RETRO SYNFLOW: Discrete Flow Matching for Accurate and Diverse Single-Step Retrosynthesis
Robin Yadav, Qi Yan, Guy Wolf, Avishek Joey Bose, Renjie Liao
Abstract
A fundamental problem in organic chemistry is identifying and predicting the series of reactions that synthesize a desired target product molecule. Due to the combinatorial nature of the chemical search space, single-step reactant prediction (i.e., single-step retrosynthesis) remains challenging: even existing state-of-the-art template-free generative approaches struggle to produce an accurate yet diverse set of feasible reactions. In this paper, we model single-step retrosynthesis planning and introduce RETRO SYNFLOW (RSF), a discrete flow-matching framework that builds a Markov bridge between the prescribed target product molecule and the reactant molecule. In contrast to past approaches, RSF employs a reaction center identification step to produce intermediate structures known as synthons, which serve as a more informative source distribution for the discrete flow. To further enhance the diversity and feasibility of generated samples, we employ Feynman-Kac steering with Sequential Monte Carlo based resampling to steer promising generations at inference using a new reward oracle that relies on a forward-synthesis model. Empirically, we demonstrate that RSF achieves 60.0% top-1 accuracy, which outperforms the previous SOTA by 20%. We also substantiate the benefits of steering at inference and demonstrate that FK-steering improves top-5 round-trip accuracy by 19% over prior template-free SOTA methods, all while preserving competitive top-k accuracy results.
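To make the resampling idea concrete, the following is a minimal sketch of one generic Sequential Monte Carlo resampling step, not the authors' implementation. It assumes each partially generated candidate (a "particle") has already been scored by some reward; here the rewards are plain numbers standing in for the forward-synthesis oracle mentioned in the abstract. The function name `resample_by_reward`, the `temperature` parameter, and the exponential weighting are all illustrative assumptions.

```python
import numpy as np


def resample_by_reward(particles, rewards, temperature=1.0, rng=None):
    """Multinomial resampling: draw particles with probability
    proportional to exp(reward / temperature).

    Hypothetical helper for illustration only; the reward values are
    assumed to come from an external scoring model.
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(rewards, dtype=float) / temperature
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Sample indices with replacement according to the normalized weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return [particles[i] for i in idx], weights


# Toy usage: three placeholder candidates with made-up rewards.
candidates = ["A", "B", "C"]
resampled, weights = resample_by_reward(
    candidates, rewards=[0.0, 0.0, 100.0], rng=np.random.default_rng(0)
)
```

In this toy call the third candidate has an overwhelmingly larger reward, so resampling concentrates the population on it. In a steering loop, a step like this would typically be interleaved with the generative model's sampling steps so that low-reward candidates are dropped and high-reward ones are duplicated.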