SOTAVerified

Unconditional Molecule Generation

This task evaluates the ability of generative models to sample valid and realistic molecular structures.

The training dataset can be:

  • QM9 (Wu et al., 2018) - consists of 130,000 stable small organic molecules containing up to nine heavy atoms (C, N, O, F) along with hydrogens.
  • GEOM-DRUGS (Axelrod and Gómez-Bombarelli, 2022) - consists of 430,000 large organic molecules of up to 180 atoms.

Following prior work (Hoogeboom et al., 2022), we generally sample 10,000 molecules and compute validity, uniqueness, and PoseBusters sanity checks (Buttenschoen et al., 2024) on the samples. Data splits generally follow prior work (Hoogeboom et al., 2022; Vignac et al., 2023) to ensure fair comparisons.
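
The validity and uniqueness computation can be sketched roughly as follows. This is a minimal illustration and not the benchmark's actual evaluation code. It assumes that validity is the fraction of samples that can be parsed into a molecule, and that uniqueness is the fraction of distinct molecules among the valid ones. The function names and the `canonicalize` callback are hypothetical. In practice the callback would wrap a cheminformatics toolkit, whereas the toy version here only strips whitespace and rejects strings containing "?".

```python
from typing import Callable, Optional, Sequence, Tuple


def validity_and_uniqueness(
    samples: Sequence[str],
    canonicalize: Callable[[str], Optional[str]],
) -> Tuple[float, float]:
    """Return (validity, uniqueness) for a batch of generated molecules.

    Assumed definitions (not taken from the benchmark's code):
      validity   = number of valid samples / number of samples
      uniqueness = number of distinct valid samples / number of valid samples
    `canonicalize` maps a sample to a canonical string, or None if invalid.
    """
    canonical = [canonicalize(s) for s in samples]
    valid = [c for c in canonical if c is not None]
    validity = len(valid) / len(canonical) if canonical else 0.0
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0
    return validity, uniqueness


def toy_canonicalize(sample: str) -> Optional[str]:
    """Hypothetical stand-in for a real parser: '?' marks an invalid sample."""
    cleaned = sample.strip()
    if not cleaned or "?" in cleaned:
        return None
    return cleaned


if __name__ == "__main__":
    # Four toy samples: one invalid, and two that canonicalize to the same string.
    generated = ["CCO", "CCO ", "C?C", "CC"]
    validity, uniqueness = validity_and_uniqueness(generated, toy_canonicalize)
    print(f"validity={validity:.2f} uniqueness={uniqueness:.2f}")
```

Passing the parser in as a callback keeps the metric logic independent of any particular toolkit. The PoseBusters checks are a separate step and are not covered by this sketch.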

Benchmark Results

#  Model       Metric                Claimed  Verified  Status
1  TABASCO     PoseBusters Validity  92       -         Unverified
2  SemlaFlow   PoseBusters Validity  87.5     -         Unverified
3  ADiT        PoseBusters Validity  85.3     -         Unverified
4  MiDi        Validity              77.8     -         Unverified
5  EQGAT-diff  PoseBusters Validity  59.7     -         Unverified

#  Model       Metric                Claimed  Verified  Status
1  ADiT        Validity              94.45    -         Unverified
2  GeoLDM      Validity              93.8     -         Unverified
3  EDM         Validity              91.9     -         Unverified
4  Symphony    Validity              83.5     -         Unverified