Hierarchical Functional Group Ranking via IUPAC Name Analysis for Drug Discovery: A Case Study on TDP1 Inhibitors
Mariya L. Ivanova, Nicola Russo, Konstantin Nikolic
Abstract
This article proposes a computational approach that ranks IUPAC-notated functional groups in descending order of their importance for a given case study. The ranking yields a reduced list of functional groups from which drug discovery can be initiated. The approach, applicable to any case study with sufficient data, was demonstrated using a PubChem bioassay focused on TDP1 inhibitors. The scikit-learn implementation of the Random Forest Classifier (RFC) algorithm was employed. The RFC machine learning (ML) model achieved 70.9% accuracy, 73.1% precision, 66.1% recall, 69.4% F1 and 70.8% receiver operating characteristic (ROC). In addition to the main study, the CID_SID ML model was developed, which, using only PubChem compound and substance identifiers (CIDs and SIDs), can predict whether a compound is a TDP1 inhibitor with 85.2% accuracy, 94.2% precision, 75% recall, 83.5% F1 and 85.2% ROC.
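As a quick consistency check on the reported metrics, F1 is the harmonic mean of precision and recall, so the RFC model's F1 can be recomputed from the precision and recall given above. The snippet below is a minimal sketch of that arithmetic only. The helper name `f1_from_precision_recall` is hypothetical and is not taken from the paper or from scikit-learn.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (fractions in [0, 1])."""
    if precision + recall == 0:
        # Convention: F1 is taken as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision (73.1%) and recall (66.1%) reported for the RFC model in the abstract.
rfc_f1 = f1_from_precision_recall(0.731, 0.661)
print(f"RFC F1: {rfc_f1:.1%}")  # prints "RFC F1: 69.4%"
```

The recomputed value agrees with the 69.4% F1 stated for the RFC model.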