SOTAVerified

RadDiff: Retrieval-Augmented Denoising Diffusion for Protein Inverse Folding

2026-03-09Unverified0· sign in to hype

Jin Han, Tianfan Fu, Wu-Jun Li

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

Protein inverse folding, the design of an amino acid sequence based on a target protein structure, is a fundamental problem of computational protein engineering. Existing methods either generate sequences without leveraging external knowledge or relying on protein language models~(PLMs). The former omits the knowledge stored in natural protein data, while the latter is parameter-inefficient and inflexible to adapt to ever-growing protein data. To overcome the above drawbacks, in this paper we propose a novel method, called retrieval-augmented denoising diffusion~(RadDiff), for protein inverse folding. In RadDiff, a novel retrieval-augmentation mechanism is designed to capture the up-to-date protein knowledge. We further design a knowledge-aware diffusion model that integrates this protein knowledge into the diffusion process via a lightweight module. Experimental results on the CATH, TS50, and PDB2022 datasets show that RadDiff consistently outperforms existing methods, improving sequence recovery rate by up to 19\%. Experimental results also demonstrate that RadDiff generates highly foldable sequences and scales effectively with database size.

Reproductions