Predicting Nanoparticle Effects on Small Biomolecule Functionalities Using the Capability of Scikit-learn and PyTorch: A Case Study on Inhibitors of the DNA Damage-Inducible Transcript 3 (CHOP)
Mariya L. Ivanova, Nicola Russo, Gueorgui Mihaylov, Konstantin Nikolic
Abstract
This study contributes to ongoing research on overcoming the challenges of predicting the bio-applicability of nanoparticles (NPs). The approach explored a variety of combinations of nuclear magnetic resonance (NMR) spectroscopy data, derived from Simplified Molecular-Input Line-Entry System (SMILES) notations, and small biomolecule features. The resulting datasets were used for machine learning (ML) with scikit-learn and for deep neural networks (DNN) with PyTorch. Despite the obstacles in predicting how NPs influence biomolecule functionalities, the methodology is argued to be applicable to compounds both with and without NPs. The methodology was illustrated on a quantitative high-throughput screening (qHTS) aimed at finding inhibitors of DNA Damage-Inducible Transcript 3 (CHOP). On this data, the best ML performance was obtained with the Random Forest Classifier, which was trained on 19,184 samples and tested on 4,000, achieving 81.1% accuracy, 83.4% precision, 77.7% recall, 80.4% F1-score, 81.1% ROC, and a five-fold cross-validation score of 0.821. Complementing the main research, the paper introduces two computational applications for CHOP inhibition drug discovery. The first identifies the most desirable and undesirable functional groups/fragments for CHOP inhibition, with a hypothetical application to nanoformulations (NFs) as well. The second is a CID_SID ML model that relies solely on PubChem identifiers to predict whether an already designed compound possesses CHOP inhibition potential (90.1% accuracy), and thus contributes to the detection of such a side effect.
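The evaluation protocol named in the abstract (a Random Forest Classifier scored by accuracy, precision, recall, F1, ROC, and five-fold cross-validation) can be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data is synthetic (`make_classification` stands in for the NMR-derived and biomolecule features), the sample counts are scaled down from 19,184/4,000 to 1,600/400, the hyperparameters are scikit-learn defaults, "ROC" is taken to mean the area under the ROC curve, and the cross-validation metric is assumed to be accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the real feature table (assumption: the paper's
# NMR-derived and small-biomolecule features are not reproduced here).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=0
)

# Hold out a fixed number of test samples, mirroring the paper's
# fixed-size test set at a smaller scale (400 instead of 4,000).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=400, stratify=y, random_state=0
)

# Default hyperparameters; the paper's actual settings are not given here.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)               # hard class labels
y_score = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    # Assumption: "ROC" in the abstract refers to the area under the ROC curve.
    "roc_auc": roc_auc_score(y_test, y_score),
}

# Five-fold cross-validation on the training split (accuracy is assumed
# as the scoring metric; the abstract does not name it).
cv_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0),
    X_train,
    y_train,
    cv=5,
    scoring="accuracy",
)
metrics["cv5_mean_accuracy"] = cv_scores.mean()

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The numbers printed by this sketch describe the synthetic data only and are not comparable to the figures reported in the abstract.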