Classification of COVID-19 associated symptomatology using machine learning

Julian Andres Ramirez-Bautista; Silvia L. Chaparro-Cárdenas; Wilson Gamboa-Contreras; William Guerrero-Salazar; Jorge Adalberto Huerta-Ruelas

doi:10.15446/dyna.v90n226.105616

Published

2023-05-25

Clasificación de la sintomatología asociada a la COVID-19 mediante aprendizaje automático

Keywords:

computer-aided diagnosis; COVID-19; disease diagnosis; machine learning; artificial neural networks

Authors

Julian Andres Ramirez-Bautista, Departamento de Investigación, Fundación Universitaria de San Gil-Unisangil, San Gil, Colombia. ORCID: https://orcid.org/0000-0002-6472-5751
Silvia L. Chaparro-Cárdenas, Departamento de Investigación, Fundación Universitaria de San Gil-Unisangil, San Gil, Colombia. ORCID: https://orcid.org/0000-0002-2589-259X
Wilson Gamboa-Contreras, Departamento de Investigación, Fundación Universitaria de San Gil-Unisangil, San Gil, Colombia. ORCID: https://orcid.org/0000-0001-5526-3156
William Guerrero-Salazar, Departamento de Investigación, Fundación Universitaria de San Gil-Unisangil, San Gil, Colombia. ORCID: https://orcid.org/0000-0002-2441-5441
Jorge Adalberto Huerta-Ruelas, Centro de Investigación en Ciencia Aplicada y Tecnología Avanzada-Instituto Politécnico Nacional, Querétaro, México. ORCID: https://orcid.org/0000-0001-5632-3368

Abstract

The health emergency caused by the SARS-CoV-2 coronavirus posed major challenges for the scientific community. Advances in artificial intelligence are a very useful resource, but it is important to determine which symptoms presented by positive cases of infection are the best predictors. A machine learning approach was applied to public data from Kaggle, with WHO-standardized symptoms, covering 5,434 people and eleven symptoms: breathing problems, dry cough, sore throat, runny nose, history of asthma, chronic lung disease, headache, heart disease, hypertension, diabetes, and fever. A simple machine learning model was developed to detect COVID-19 positive cases. The results obtained with four loss functions, together with SHAP values, were compared. The best loss function was binary cross-entropy, with a single hidden layer of 10 neurons, achieving an F1 score of 0.98 and an area under the ROC curve (AUC-ROC) of 0.99.
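To make the configuration named in the abstract concrete, the following is a minimal sketch of a network with 11 binary symptom inputs, one hidden layer of 10 neurons, and a single output, together with the binary cross-entropy loss and the F1 score. It is an illustration under stated assumptions, not the authors' implementation: the sigmoid activations, the random untrained weights, the 0.5 decision threshold, the example symptom vectors, and all function names are hypothetical, since the abstract does not specify them.

```python
import math
import random

N_SYMPTOMS = 11  # number of symptom inputs reported in the abstract
N_HIDDEN = 10    # single hidden layer with 10 neurons, as in the abstract


def sigmoid(z):
    """Logistic function mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_network(seed=0):
    """Small random weights for an 11-10-1 network (untrained, illustrative)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_SYMPTOMS)]
                for _ in range(N_HIDDEN)]
    b_hidden = [0.0] * N_HIDDEN
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def predict_proba(network, symptoms):
    """Forward pass: symptoms -> hidden layer -> probability of a positive case.

    Sigmoid activations are an assumption; the abstract does not name them.
    """
    w_hidden, b_hidden, w_out, b_out = network
    hidden = [sigmoid(sum(w * x for w, x in zip(row, symptoms)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean BCE: -(1/N) * sum(y*log(p) + (1-y)*log(1-p))."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if the denominator is zero."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up symptom vectors (1 = present, 0 = absent) and labels, for illustration only.
patients = [
    [1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0],
]
labels = [1, 0]

network = init_network()
probs = [predict_proba(network, p) for p in patients]
preds = [1 if p >= 0.5 else 0 for p in probs]  # 0.5 threshold is an assumption
print("BCE:", binary_cross_entropy(labels, probs))
print("F1:", f1_score(labels, preds))
```

Because the weights here are random, the printed values carry no meaning; in a real experiment the weights would be fitted by minimizing the loss on training data, and the F1 score and AUC-ROC would be computed on held-out cases.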

How to cite

IEEE

[1]

J. A. Ramirez-Bautista, S. L. Chaparro-Cárdenas, W. Gamboa-Contreras, W. Guerrero-Salazar, and J. A. Huerta-Ruelas, "Classification of COVID-19 associated symptomatology using machine learning," DYNA, vol. 90, no. 226, pp. 36–43, May 2023.

License

Copyright 2023 DYNA

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.

The author or authors of an article accepted for publication in any of the journals published by the Facultad de Minas will assign all economic rights to the Universidad Nacional de Colombia free of charge. These include the right to edit, publish, reproduce, and distribute the article in both print and digital media, and to include the article in international indexes and/or databases. The publisher is likewise authorized to use the images, tables, and/or any graphic material presented in the article for the design of covers or posters for the same journal.