SOTAVerified

Text-based de novo Molecule Generation

Text-based de novo molecule generation applies natural language processing (NLP) techniques to chemical information in order to generate entirely new molecular structures. In this approach, molecular structures are typically encoded as text strings, such as SMILES (Simplified Molecular Input Line Entry System) or other formula-like notations. NLP models such as recurrent neural networks (RNNs) or Transformers then process these strings to generate novel molecular structures with desired properties.
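As a rough illustration of the encoding step described above, the sketch below splits SMILES strings into tokens and maps them to integer ids, which is the kind of input an RNN or Transformer would consume. The regular expression is a simplified assumption made for this example (bracket atoms, the two-letter halogens Br and Cl, two-digit ring closures, then any single character), and the helper names `tokenize_smiles`, `build_vocab`, and `encode` are hypothetical, not taken from any of the listed papers.

```python
import re
from typing import Dict, Iterable, List

# Simplified, illustrative SMILES token pattern (an assumption, not a full grammar):
# bracket atoms like [NH4+], two-letter halogens, %NN ring closures, else one character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(corpus: Iterable[str]) -> Dict[str, int]:
    """Assign an integer id to every distinct token seen in the corpus."""
    tokens = sorted({tok for s in corpus for tok in tokenize_smiles(s)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def encode(smiles: str, vocab: Dict[str, int]) -> List[int]:
    """Convert a SMILES string into the id sequence a sequence model would read."""
    return [vocab[tok] for tok in tokenize_smiles(smiles)]


# Toy corpus: ethanol, benzene, acetyl chloride, ammonium.
corpus = ["CCO", "c1ccccc1", "CC(=O)Cl", "[NH4+]"]
vocab = build_vocab(corpus)

if __name__ == "__main__":
    for smiles in corpus:
        print(smiles, tokenize_smiles(smiles), encode(smiles, vocab))
```

Generation runs in the opposite direction: a trained model emits token ids one at a time, and joining the corresponding tokens yields a candidate SMILES string, which would still need to be checked for chemical validity with a cheminformatics toolkit.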

Papers

Showing 1-10 of 14 papers

| Title | Status | Hype |
|---|---|---|
| Automatic Annotation Augmentation Boosts Translation between Molecules and Natural Language | Code | 0 |
| Bayesian Flow Is All You Need to Sample Out-of-Distribution Chemical Spaces | Code | 1 |
| MolReFlect: Towards Fine-grained In-Context Alignment between Molecules and Texts | | 0 |
| A Bayesian Flow Network Framework for Chemistry Tasks | Code | 1 |
| LDMol: Text-to-Molecule Diffusion Model with Structurally Informative Latent Space | Code | 1 |
| BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning | Code | 2 |
| Text-Guided Molecule Generation with Diffusion Language Model | Code | 1 |
| BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations | Code | 1 |
| GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text | Code | 1 |
| Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective | Code | 1 |

Benchmark Results

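| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LDMol | BLEU | 92.6 | | Unverified |
| 2 | MolReFlect | BLEU | 90.3 | | Unverified |
| 3 | BioT5+ | BLEU | 87.2 | | Unverified |
| 4 | BioT5 | BLEU | 86.7 | | Unverified |
| 5 | MolReGPT (GPT-4-0413) | BLEU | 85.7 | | Unverified |
| 6 | MolT5-Large | BLEU | 85.4 | | Unverified |
| 7 | Text+Chem T5-augm base | BLEU | 85.3 | | Unverified |
| 8 | TGM-DLM w/o corr | BLEU | 82.8 | | Unverified |
| 9 | TGM-DLM | BLEU | 82.6 | | Unverified |
| 10 | MolFM-Base | BLEU | 82.2 | | Unverified |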