Predictive model for identifying new CYP19A1 ligands on the KNIME analytical platform
https://doi.org/10.29235/1561-8323-2023-67-5-388-398
Abstract
The purpose of this study was to create a database of chemical compounds that are ligands of the human steroid-hydroxylating cytochrome CYP19A1 (aromatase), to be used for building a predictive model.
The idea was to build the model with a machine learning method, random forest, for two types of ligands: steroidal (type I) and non-steroidal (type II). Two predictive models were built using the KNIME analytical platform. Topological descriptors of chemical structure were used as training data, so that the models account for the correlation between the structure of a molecule and its biological effect. Descriptors were selected by feature importance, the random forest parameters were optimized, and the applicability domain of each model was defined. The ability of each model to predict the outcomes for a test sample was assessed. The quality metrics of the resulting models indicate a rather high predictive ability and good prospects for their use in identifying new human CYP19A1 ligands as potential drugs for the treatment of hormone-dependent tumors.
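The abstract does not say which topological descriptors were used or how the applicability domain was defined, so the following Python sketch is an illustration only and not the authors' KNIME workflow. It assumes the Wiener index as an example of a topological descriptor and a simple descriptor-range (bounding box) rule as an example of an applicability domain. The function names and the toy carbon skeletons are hypothetical.

```python
from collections import deque


def wiener_index(adjacency):
    """Wiener index: sum of shortest-path lengths (in bonds) over all atom pairs."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives bond-count distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbor in adjacency[atom]:
                if neighbor not in dist:
                    dist[neighbor] = dist[atom] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
    # Each unordered pair was counted once from each end.
    return total // 2


def describe(adjacency):
    """Toy descriptor vector: (number of heavy atoms, Wiener index)."""
    return (len(adjacency), wiener_index(adjacency))


def descriptor_ranges(training_rows):
    """Per-descriptor (min, max) observed in the training set."""
    return [(min(col), max(col)) for col in zip(*training_rows)]


def in_domain(row, ranges):
    """Assumed rule: a compound is in the domain if every descriptor lies in range."""
    return all(low <= value <= high for value, (low, high) in zip(row, ranges))


# Hypothetical hydrogen-suppressed carbon skeletons as adjacency lists.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
cyclohexane = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

training = [describe(n_butane), describe(isobutane)]
ranges = descriptor_ranges(training)

print(describe(cyclohexane), in_domain(describe(cyclohexane), ranges))
```

In this toy setting cyclohexane falls outside the range spanned by the two training skeletons, so a prediction for it would be flagged as unreliable. A real workflow would compute many descriptors with a cheminformatics toolkit and pass them to a random forest learner.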
About the Authors
M. I. Shaladonova
Belarus
Marina I. Shaladonova, Master’s Student
220070
38B, Radialnaya Str.
Minsk
Ya. V. Dzichenka
Belarus
Yaraslau V. Dzichenka, Ph. D. (Chemistry), Associate Professor, Leading Researcher
220084
5/2, Kuprevich Str.
Minsk
S. A. Usanov
Belarus
Sergei A. Usanov, Corresponding Member, D. Sc. (Chemistry), Professor
220084
5/2, Kuprevich Str.
Minsk