Sparse Data Generation Using Diffusion Models
Phil Ostheimer, Mayank Nagda, Jean Radig, Carl Herrmann, Stephan Mandt, Marius Kloft, Sophie Fellenz
Abstract
Sparse data is ubiquitous, appearing in numerous domains, from economics and recommender systems to astronomy and biomedical sciences. However, efficiently generating high-fidelity synthetic sparse data remains a significant challenge. We introduce Sparse Data Diffusion (SDD), a novel method for generating sparse data. SDD extends continuous state-space diffusion models with an explicit representation of exact zeros by modeling sparsity through the introduction of Sparsity Bits. Empirical validation in various domains, including two scientific applications in physics and biology, demonstrates that SDD achieves high fidelity in representing data sparsity while preserving the quality of the generated data.
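The abstract does not spell out how Sparsity Bits are encoded, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that each data entry is paired with one extra continuous channel that takes the value +1 where the entry is exactly zero and -1 otherwise, and that after generation this channel is thresholded to restore exact zeros. The function names, the ±1 encoding, the threshold, and the Gaussian perturbation standing in for an imperfect continuous generative model are all hypothetical.

```python
import numpy as np

def encode_with_sparsity_bits(x):
    """Pair each value with a {-1, +1} channel marking exact zeros (assumed encoding)."""
    x = np.asarray(x, dtype=np.float64)
    bits = np.where(x == 0.0, 1.0, -1.0)  # +1 means "this entry is an exact zero"
    return np.stack([x, bits], axis=0)    # shape: (2, ...) = (values, sparsity bits)

def decode_with_sparsity_bits(z, threshold=0.0):
    """Threshold the bit channel and force the flagged entries to exact zero."""
    values, bits = z[0], z[1]
    is_zero = bits > threshold
    return np.where(is_zero, 0.0, values)

rng = np.random.default_rng(0)
x = np.array([0.0, 1.5, 0.0, 0.0, 2.25, 0.0])

z = encode_with_sparsity_bits(x)

# Stand-in for an imperfect continuous generative model (assumption):
# every channel, including the bit channel, comes back slightly perturbed.
z_generated = z + 0.1 * rng.standard_normal(z.shape)

naive = z_generated[0]                          # continuous output only
x_hat = decode_with_sparsity_bits(z_generated)  # output gated by sparsity bits

print("exact zeros, naive:", np.count_nonzero(naive == 0.0))
print("exact zeros, gated:", np.count_nonzero(x_hat == 0.0))
```

The point of the sketch is that a purely continuous model almost never emits an exact zero, whereas a separate, thresholded indicator channel can reproduce the zero pattern even when the continuous values are slightly off.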