Unconditional Molecule Generation
This task evaluates the ability of generative models to sample valid and realistic molecular structures.
The training dataset can be:
- QM9 (Wu et al., 2018) - consists of 130,000 stable small organic molecules containing up to nine heavy atoms (C, N, O, F) along with hydrogens.
- GEOM-DRUGS (Axelrod and Gómez-Bombarelli, 2022) - consists of 430,000 large organic molecules of up to 180 atoms.
Following prior work (Hoogeboom et al., 2022), we generally sample 10,000 molecules and compute validity, uniqueness, and PoseBusters sanity checks (Buttenschoen et al., 2024) on the generated samples. Data splits generally follow prior work (Hoogeboom et al., 2022; Vignac et al., 2023) to ensure fair comparisons.
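As a rough illustration of how validity and uniqueness can be aggregated over a batch of samples, here is a minimal sketch. It assumes a common convention: validity is the fraction of samples that pass a chemistry check, and uniqueness is the fraction of distinct molecules among the valid ones. The function name `validity_and_uniqueness` is hypothetical, and the input is assumed to be prepared upstream (for example, by a cheminformatics toolkit) as one canonical SMILES string per valid sample and `None` per failed sample. Individual papers may define these metrics slightly differently, and PoseBusters checks are not covered here.

```python
from typing import Optional, Sequence


def validity_and_uniqueness(canonical_smiles: Sequence[Optional[str]]) -> dict:
    """Summarize a batch of generated molecules (hypothetical helper).

    Each entry is a canonical SMILES string for a sample that passed the
    validity check, or None for a sample that failed it.
    """
    n_total = len(canonical_smiles)
    # Keep only the samples that passed the upstream validity check.
    valid = [s for s in canonical_smiles if s is not None]
    # Identical canonical strings are counted as the same molecule.
    n_unique = len(set(valid))
    return {
        "validity": len(valid) / n_total if n_total else 0.0,
        "uniqueness": n_unique / len(valid) if valid else 0.0,
    }


# Toy batch: four samples, one invalid, one duplicate.
samples = ["CCO", "CCO", None, "c1ccccc1"]
result = validity_and_uniqueness(samples)
print(result)
```

In this toy batch, three of four samples are valid and two of those three are distinct. In practice the batch would hold the 10,000 generated molecules mentioned above.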
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TABASCO | PoseBusters Validity | 92 | — | Unverified |
| 2 | SemlaFlow | PoseBusters Validity | 87.5 | — | Unverified |
| 3 | ADiT | PoseBusters Validity | 85.3 | — | Unverified |
| 4 | MiDi | Validity | 77.8 | — | Unverified |
| 5 | EQGAT-diff | PoseBusters Validity | 59.7 | — | Unverified |