K-NN algorithm for identifying potential drugs against COVID-19

Raúl Isea

Authors

Raúl Isea Fundación Instituto de Estudios Avanzados, Miranda, Venezuela https://orcid.org/0000-0002-6318-3428

Keywords:

ChEMBL, clusters, K-NN, cheminformatics, SARS-CoV-2

Abstract

The aim of this research was to explore and validate the application of the K-NN algorithm for identifying groups of compounds that could be used against COVID-19 by means of cheminformatics methods. To this end, the compounds in the ChEMBL database that have been used in experimental studies on SARS-CoV-2 were analyzed. This information was analyzed manually, yielding 1904 biomolecules categorized as “Active” or “Inactive” according to their inhibitory activity against the virus. A K-nearest neighbors (K-NN) algorithm was then used to group the biomolecules according to their physicochemical similarity. The study showed that this type of algorithm is a valuable tool for identifying candidate starting compounds for further research to help combat COVID-19, thereby establishing a methodological basis for future work on this topic.
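
As a rough illustration of the K-NN step mentioned in the abstract, the following minimal sketch labels a query compound by majority vote among its nearest neighbors in a space of physicochemical descriptors. The abstract does not state which descriptors, distance metric, or value of k were used, so the three descriptors (molecular weight, logP, hydrogen-bond donors), the min-max scaling, the Euclidean distance, `k=3`, and the names `TRAINING`, `fit_scaler`, `scale`, and `knn_predict` are all assumptions made for this example. The descriptor values are invented placeholders and are not taken from ChEMBL.

```python
import math
from collections import Counter

# Hypothetical toy records: (molecular weight, logP, H-bond donors) -> label.
# These numbers are invented for illustration; they are NOT ChEMBL data.
TRAINING = [
    ((320.0, 2.1, 2), "Active"),
    ((305.0, 1.8, 3), "Active"),
    ((340.0, 2.5, 2), "Active"),
    ((510.0, 5.2, 0), "Inactive"),
    ((495.0, 4.8, 1), "Inactive"),
    ((530.0, 5.5, 0), "Inactive"),
]


def fit_scaler(rows):
    """Return per-feature (min, max) so each descriptor can be mapped to [0, 1]."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def scale(row, bounds):
    """Min-max scale one descriptor vector; constant features map to 0.0."""
    return tuple(
        (value - lo) / (hi - lo) if hi > lo else 0.0
        for value, (lo, hi) in zip(row, bounds)
    )


def knn_predict(query, training, k=3):
    """Label `query` by majority vote among its k nearest training compounds."""
    bounds = fit_scaler([features for features, _ in training])
    scaled_query = scale(query, bounds)
    # Euclidean distance in the scaled descriptor space (an assumed metric).
    distances = sorted(
        (math.dist(scaled_query, scale(features, bounds)), label)
        for features, label in training
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict((315.0, 2.0, 2), TRAINING))
    print(knn_predict((505.0, 5.0, 0), TRAINING))
```

Scaling the descriptors before computing distances keeps a large-valued feature such as molecular weight from dominating the neighbor search. A real workflow would more likely compute descriptors with a cheminformatics toolkit and use an established K-NN implementation, but those choices are outside what the abstract specifies.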

