Text-based de novo Molecule Generation
Text-based de novo molecule generation applies natural language processing (NLP) techniques to chemical information in order to generate entirely new molecular structures. In this approach, molecular structures are typically encoded as text strings, in a notation that resembles chemical formulas, such as SMILES (Simplified Molecular-Input Line-Entry System). Sequence models such as recurrent neural networks (RNNs) or Transformers are then applied to these strings to generate novel molecular structures with desired properties.
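To make the "molecules as text" idea concrete, the sketch below splits a SMILES string into tokens that a sequence model could consume. It is a minimal illustration, not the preprocessing of any model listed on this page: the regular expression is a simplified pattern covering common SMILES symbols, and the names `SMILES_TOKEN_PATTERN` and `tokenize_smiles` are hypothetical.

```python
import re
from typing import List

# Simplified, illustrative pattern (an assumption, not a full SMILES grammar):
# bracket atoms, two-letter halogens, organic-subset atoms, aromatic atoms,
# two-digit ring closures, single digits, then bond and branch symbols.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnops]|%\d{2}|\d|[=#$:/\\\-+().~*@])"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; raise if any character is not covered."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # Guard against silently dropping characters the pattern does not know.
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens


# Aspirin: every character is its own token here.
aspirin_tokens = tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O")

# Multi-character tokens: a bracket atom and two-letter halogens.
salt_tokens = tokenize_smiles("[Na+].[Cl-]")
halide_tokens = tokenize_smiles("ClCCBr")
```

Tokenizing on chemically meaningful units rather than raw characters keeps `Cl` and `Br` from being split into two unrelated symbols, which is one common design choice; character-level and learned subword vocabularies are also used in practice.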
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LDMol | BLEU | 92.6 | — | Unverified |
| 2 | MolReFlect | BLEU | 90.3 | — | Unverified |
| 3 | BioT5+ | BLEU | 87.2 | — | Unverified |
| 4 | BioT5 | BLEU | 86.7 | — | Unverified |
| 5 | MolReGPT (GPT-4-0413) | BLEU | 85.7 | — | Unverified |
| 6 | MolT5-Large | BLEU | 85.4 | — | Unverified |
| 7 | Text+Chem T5-augm base | BLEU | 85.3 | — | Unverified |
| 8 | TGM-DLM w/o corr | BLEU | 82.8 | — | Unverified |
| 9 | TGM-DLM | BLEU | 82.6 | — | Unverified |
| 10 | MolFM-Base | BLEU | 82.2 | — | Unverified |
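The table reports BLEU but does not say how it is computed. As a rough illustration only, the sketch below computes a character-level BLEU score between a reference SMILES string and a generated one. The character-level tokenization, the single-pair (rather than corpus-level) scoring, the absence of smoothing, and the function names `ngram_counts` and `char_bleu` are all assumptions made for this example, so its numbers are not comparable to the claimed scores above.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def char_bleu(reference: str, candidate: str, max_n: int = 4) -> float:
    """Unsmoothed BLEU on characters for one reference/candidate pair (0 to 1)."""
    ref, cand = list(reference), list(candidate)
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clipped overlap: each candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing: any empty n-gram order gives zero
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "CC(=O)Oc1ccccc1C(=O)O"
exact = char_bleu(reference, reference)
near_miss = char_bleu(reference, "CC(=O)Oc1ccccc1C(=O)N")
```

Multiplying the result by 100 would put it on the same 0 to 100 scale the table appears to use. Note that BLEU measures string overlap only: two different SMILES strings can describe the same molecule, and a high-scoring string can still be chemically invalid.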