<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article
  PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "https://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.1" specific-use="sps-1.9" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
	<front>
		<journal-meta>
			<journal-id journal-id-type="publisher-id">dyna</journal-id>
			<journal-title-group>
				<journal-title>DYNA</journal-title>
				<abbrev-journal-title abbrev-type="publisher">Dyna rev.fac.nac.minas</abbrev-journal-title>
			</journal-title-group>
			<issn pub-type="ppub">0012-7353</issn>
			<issn pub-type="epub">2346-2183</issn>
			<publisher>
				<publisher-name>Universidad Nacional de Colombia</publisher-name>
			</publisher>
		</journal-meta>
		<article-meta>
			<article-id pub-id-type="doi">10.15446/dyna.v90n226.105616</article-id>
			<article-categories>
				<subj-group subj-group-type="heading">
					<subject>Article</subject>
				</subj-group>
			</article-categories>
			<title-group>
				<article-title>Classification of COVID-19 associated symptomatology using machine learning</article-title>
				<trans-title-group xml:lang="es">
					<trans-title>Clasificación de la sintomatología asociada a la COVID-19 mediante aprendizaje automático</trans-title>
				</trans-title-group>
			</title-group>
			<contrib-group>
				<contrib contrib-type="author">
					<contrib-id contrib-id-type="orcid">0000-0002-6472-5751</contrib-id>
					<name>
						<surname>Ramirez-Bautista</surname>
						<given-names>Julian Andres</given-names>
					</name>
					<xref ref-type="aff" rid="aff1"><sup>a</sup></xref>
				</contrib>
				<contrib contrib-type="author">
					<contrib-id contrib-id-type="orcid">0000-0002-2589-259X</contrib-id>
					<name>
						<surname>Chaparro-Cárdenas</surname>
						<given-names>Silvia L.</given-names>
					</name>
					<xref ref-type="aff" rid="aff1"><sup>a</sup></xref>
				</contrib>
				<contrib contrib-type="author">
					<contrib-id contrib-id-type="orcid">0000-0001-5526-3156</contrib-id>
					<name>
						<surname>Gamboa-Contreras</surname>
						<given-names>Wilson</given-names>
					</name>
					<xref ref-type="aff" rid="aff1"><sup>a</sup></xref>
				</contrib>
				<contrib contrib-type="author">
					<contrib-id contrib-id-type="orcid">0000-0002-2441-5441</contrib-id>
					<name>
						<surname>Guerrero-Salazar</surname>
						<given-names>William</given-names>
					</name>
					<xref ref-type="aff" rid="aff1"><sup>a</sup></xref>
				</contrib>
				<contrib contrib-type="author">
					<contrib-id contrib-id-type="orcid">0000-0001-5632-3368</contrib-id>
					<name>
						<surname>Huerta-Ruelas</surname>
						<given-names>Jorge Adalberto</given-names>
					</name>
					<xref ref-type="aff" rid="aff2"><sup>b</sup></xref>
				</contrib>
			</contrib-group>
			<aff id="aff1">
				<label>a</label>
				<institution content-type="original"> Departamento de Investigación, Fundación Universitaria de San Gil-Unisangil, San Gil, Colombia. jramirez@unisangil.edu.co, schaparro@unisangil.edu.co, wgamboa@unisangil.edu.co, wguerrero@unisangil.edu.co</institution>
				<institution content-type="normalized">Fundación Universitaria de San Gil</institution>
				<institution content-type="orgdiv1">Departamento de Investigación</institution>
				<institution content-type="orgname">Fundación Universitaria de San Gil-Unisangil</institution>
				<addr-line>
					<city>San Gil</city>
				</addr-line>
				<country country="CO">Colombia</country>
				<email>jramirez@unisangil.edu.co</email>
				<email>schaparro@unisangil.edu.co</email>
				<email>wgamboa@unisangil.edu.co</email>
				<email>wguerrero@unisangil.edu.co</email>
			</aff>
			<aff id="aff2">
				<label>b</label>
				<institution content-type="original"> Centro de Investigación en Ciencia Aplicada y Tecnología Avanzada-Instituto Politécnico Nacional, Querétaro, México. jhuertar@ipn.mx</institution>
				<institution content-type="orgname">Centro de Investigación en Ciencia Aplicada y Tecnología Avanzada-Instituto Politécnico Nacional</institution>
				<addr-line>
					<city>Querétaro</city>
				</addr-line>
				<country country="MX">México</country>
				<email>jhuertar@ipn.mx</email>
			</aff>
			<pub-date date-type="pub" publication-format="electronic">
				<day>12</day>
				<month>02</month>
				<year>2024</year>
			</pub-date>
			<pub-date date-type="collection" publication-format="electronic">
				<season>Apr-Jun</season>
				<year>2023</year>
			</pub-date>
			<volume>90</volume>
			<issue>226</issue>
			<fpage>36</fpage>
			<lpage>43</lpage>
			<history>
				<date date-type="received">
					<day>02</day>
					<month>11</month>
					<year>2022</year>
				</date>
				<date date-type="rev-recd">
					<day>04</day>
					<month>04</month>
					<year>2023</year>
				</date>
				<date date-type="accepted">
					<day>09</day>
					<month>04</month>
					<year>2023</year>
				</date>
			</history>
			<permissions>
				<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by-nc-nd/4.0/" xml:lang="en">
					<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License</license-p>
				</license>
			</permissions>
			<abstract>
				<title>Abstract</title>
				<p>The health emergency caused by the SARS-CoV-2 coronavirus posed major challenges for the scientific community. Advances in artificial intelligence are a useful resource, but it is important to determine which symptoms presented by positive cases are the best predictors of infection. A machine learning approach was applied to public data from Kaggle, with WHO-standardized symptoms, covering 5,434 people and eleven features: breathing problems, dry cough, sore throat, runny nose, history of asthma, chronic lung disease, headache, heart disease, hypertension, diabetes, and fever. A simple neural network model was developed to detect COVID-19 positive cases. The results of four loss functions were compared, and feature relevance was assessed with SHAP values. The best loss function was Binary Cross Entropy, with a single hidden layer of 10 neurons, achieving an F1 score of 0.98 and an area under the ROC curve (auROC) of 0.99.</p>
			</abstract>
			<trans-abstract xml:lang="es">
				<title>Resumen</title>
				<p>La situación sanitaria provocada por el coronavirus SARS-Cov2 plantea grandes retos a la comunidad científica. Los avances en inteligencia artificial son un recurso muy útil, pero es importante determinar qué síntomas presentados por los casos positivos de infección son los mejores predictores. Se utilizó un enfoque de aprendizaje automático con datos de 5.434 personas, con once síntomas: problemas respiratorios, tos seca, dolor de garganta, secreción nasal, antecedentes de asma, pulmón crónico, dolor de cabeza, enfermedad cardíaca, hipertensión, diabetes y fiebre. Basado en datos públicos de Kaggle con síntomas estandarizados por la OMS. Se desarrolló un modelo para detectar los casos positivos de COVID-19 utilizando un modelo simple de aprendizaje automático. Se compararon los resultados de 4 funciones de pérdida y por valores SHAP. La mejor función de pérdida fue la Entropía Cruzada Binaria, con una configuración de una sola capa oculta con 10 neuronas, logrando una puntuación F1 de 0,98 y el modelo fue calificado con un área bajo la curva de 0,99 aucROC. </p>
			</trans-abstract>
			<kwd-group xml:lang="en">
				<title>Keywords:</title>
				<kwd>computer-aided diagnosis</kwd>
				<kwd>COVID-19</kwd>
				<kwd>disease diagnosis</kwd>
				<kwd>machine learning</kwd>
				<kwd>artificial neural networks</kwd>
			</kwd-group>
			<kwd-group xml:lang="es">
				<title>Palabras clave:</title>
				<kwd>diagnóstico asistido por ordenador</kwd>
				<kwd>COVID-19</kwd>
				<kwd>diagnóstico de enfermedades</kwd>
				<kwd>aprendizaje automático</kwd>
				<kwd>redes neuronales artificiales</kwd>
			</kwd-group>
			<counts>
				<fig-count count="5"/>
				<table-count count="10"/>
				<equation-count count="5"/>
				<ref-count count="21"/>
				<page-count count="8"/>
			</counts>
		</article-meta>
	</front>
	<body>
		<sec sec-type="intro">
			<title>1. Introduction</title>
			<p>Electronic medical data are now ubiquitous and available in large quantities and with high accuracy from a variety of sources, including clinical institutions, individual patients, insurance companies, and the pharmaceutical industry. These data offer great opportunities for understanding risk factors and disease spread and for continuous health monitoring, enabling targeted prevention approaches [<xref ref-type="bibr" rid="B1">1</xref>]. Advances in computer science, especially in machine learning and in methods for data collection and storage, have driven growing interest in healthcare data analysis [<xref ref-type="bibr" rid="B2">2</xref>,<xref ref-type="bibr" rid="B3">3</xref>].</p>
			<p>The emergence and spread of the SARS-CoV-2 coronavirus, which causes the disease called COVID-19 [<xref ref-type="bibr" rid="B4">4</xref>,<xref ref-type="bibr" rid="B5">5</xref>], has become a particular challenge for healthcare professionals and the general population [<xref ref-type="bibr" rid="B6">6</xref>]. The disease placed a great burden on healthcare systems, and total confinements generated losses because many of the people confined were not infected, while the follow-up of positive cases became a complex task [<xref ref-type="bibr" rid="B7">7</xref>]. </p>
			<p>Many studies have focused on identifying infected individuals to isolate them and allow non-infected individuals to work regularly. The use of clinical symptoms is essential to optimize the identification of infected individuals. </p>
			<p>In this sense, researchers have developed predictive models that combine several features, such as clinical symptoms and laboratory tests. Other models detect possible contagion, estimate the risk of infection, and classify the population to support medical personnel and countries' economies [<xref ref-type="bibr" rid="B8">8</xref>,<xref ref-type="bibr" rid="B9">9</xref>]. References [<xref ref-type="bibr" rid="B8">8</xref>,<xref ref-type="bibr" rid="B10">10</xref>] predict a SARS-CoV-2 infection by asking 8 basic questions, of which 5 refer to symptomatology (fever, cough, sore throat, shortness of breath, and headache), obtaining an accuracy of 90%. Chen et al. explored the distributions of comorbidities and symptoms, in addition to laboratory test results, to distinguish between non-severe and severe types of COVID-19; using machine learning, they were able to identify key features separating both clinical types, providing an accurate diagnostic decision support tool [<xref ref-type="bibr" rid="B11">11</xref>]. Ahamad et al. developed a model using supervised machine learning algorithms to identify features that predict the diagnosis of COVID-19. Using the XGBoost algorithm, they obtained an accuracy of over 85% in predicting and selecting features that correctly indicate COVID-19 status, finding that the most frequent and significant predictive symptoms are fever (41.1%), cough (30.3%), lung infection (13.1%) and nasal discharge (8.43%) [<xref ref-type="bibr" rid="B12">12</xref>]. Using a dataset with similar features, other models predicting COVID-19 with an auROC of 0.90 have been reported, based on a gradient boosting machine built with decision tree base learners [<xref ref-type="bibr" rid="B8">8</xref>]. Another approach is reported by Khanday et al., who experimented with various algorithms, such as random forest, stochastic gradient boosting, and decision trees, to classify 212 labeled clinical reports into four classes (COVID, SARS, ARDS, and both COVID and ARDS); logistic regression and the multinomial Naïve Bayes classifier gave excellent results, with accuracies of 94% and 96.2% [<xref ref-type="bibr" rid="B13">13</xref>].</p>
			<p>Clinically, COVID-19 is complex and manifests itself through a limited number of symptoms, such as fever, cough, and intense headache [<xref ref-type="bibr" rid="B8">8</xref>,<xref ref-type="bibr" rid="B11">11</xref>]. If these parameters are analyzed with systems based on machine learning algorithms, it is possible to fight this virus and future viruses by continuously monitoring individuals to improve detection and isolation and to provide disease control recommendations [<xref ref-type="bibr" rid="B14">14</xref>]. </p>
			<p>The study presents the use of machine learning algorithms for COVID-19 detection using the symptoms and physical conditions of 5,434 people with and without the disease. A neural network built with Keras and TensorFlow was used. The results of four loss functions were compared, based on two performance indicators, the F1 score and the area under the ROC curve, to learn the behavior of the neural network as a basis for integrating such models into systems that allow preliminary detection of the disease. </p>
			<p>The highest F1 score using 11 features was 0.98, with an auROC of 0.99. Using the SHAP values, the features the model considered least relevant were eliminated, reducing the number of inputs from 11 to 6 (about 45%) while changing performance by only 2 to 3%, depending on the metric. This demonstrates the usefulness of knowing the importance of the features within a classification model in order to simplify it without substantially affecting performance.</p>
		</sec>
		<sec sec-type="materials|methods">
			<title>2. Materials and methods</title>
			<sec>
				<title>2.1. Study data</title>
				<p>The dataset used contains the records of 5,434 people and was obtained from the Kaggle platform for experimenting with machine learning models. From these data, a model was developed that predicts COVID-19 status using five binary features describing clinical condition (asthma, chronic lung disease, heart disease, diabetes, and hypertension) and six initial clinical symptoms (breathing problem, fever, dry cough, sore throat, runny nose, and headache). </p>
				<p>The training-validation set consisted of records from 1,051 individuals without the disease and 4,383 individuals with the disease. <xref ref-type="table" rid="t1">Table 1</xref> describes each of the features of the dataset used by the model; dry cough and fever, respectively, are the features present in the greatest number of individuals. </p>
				<p>
					<table-wrap id="t1">
						<label>Table 1</label>
						<caption>
							<title>Features of the data set used by the model in this study</title>
						</caption>
						<graphic xlink:href="2346-2183-dyna-90-226-36-gt1.jpg"/>
						<table-wrap-foot>
							<fn id="TFN1">
								<p>Source: The Authors</p>
							</fn>
						</table-wrap-foot>
					</table-wrap>
				</p>
				<p>After an exploratory analysis, no null data or empty cells were found. The figures show, for each class, the number of cases with and without each feature. For the COVID-19 positive class, <xref ref-type="fig" rid="f1">Fig. 1</xref>(a) shows a large number of people presenting cough, fever, sore throat and breathing problem; the other features show a balance between presence and absence. </p>
				<p>On the other hand, the negative COVID-19 class only shows a considerable imbalance in people with breathing problem and sore throat (<xref ref-type="fig" rid="f1">Fig. 1</xref>(b)). </p>
				<p>
					<fig id="f1">
						<label>Figure 1</label>
						<caption>
							<title>Features distribution grouped by class. a) COVID-19 positive. b) COVID-19 negative.</title>
						</caption>
						<graphic xlink:href="2346-2183-dyna-90-226-36-gf1.jpg"/>
						<attrib>Source: The Authors</attrib>
					</fig>
				</p>
				<p>Considering that this is an experimental stage of algorithm testing, the UNISANGIL ethics committee determined that the public health dataset used in this study does not require approval for analysis. The development of systems for constant monitoring of physiological parameters supports public health efforts for the monitoring and control of communicable and non-communicable diseases.</p>
			</sec>
			<sec>
				<title>2.2. Experiment setup and design</title>
				<p>The dataset was divided into two classes: COVID-19 positive and COVID-19 negative. This yields a typical binary classification problem: determining whether a person was infected or not. To evaluate the performance of the approach, the data set was split into training and test sets of 80% and 20%, respectively (<xref ref-type="table" rid="t2">Table 2</xref>).</p>
				<p>The experiments were run with Keras on Python 3.6 under the Windows 10 operating system. The hardware used had an i5-7300HQ CPU, 8 GB of RAM and an NVIDIA GeForce GTX 1050 GPU.</p>
				<p>
					<table-wrap id="t2">
						<label>Table 2</label>
						<caption>
							<title>Sample Dataset</title>
						</caption>
						<graphic xlink:href="2346-2183-dyna-90-226-36-gt2.jpg"/>
						<table-wrap-foot>
							<fn id="TFN2">
								<p>Source: The Authors</p>
							</fn>
						</table-wrap-foot>
					</table-wrap>
				</p>
				<p>The Keras Dense class was used as the building block of a fully connected layered model, trained with the different loss functions described below:</p>
				<p>Binary Cross Entropy, also known as log loss, is a loss function used in binary classification tasks. It is the negative mean of the logarithm of the probabilities predicted for the true class, so that predictions are penalized according to their distance from the expected value [<xref ref-type="bibr" rid="B15">15</xref>]. Its mathematical formulation is given in (1).</p>
				<p>
					<disp-formula id="e1">
						<graphic xlink:href="2346-2183-dyna-90-226-36-e1.png"/>
					</disp-formula>
				</p>
				<p>Poisson Loss estimates the loss distribution from the combination of loss frequency and loss severity [<xref ref-type="bibr" rid="B16">16</xref>], taking the form of (2).</p>
				<p>
					<disp-formula id="e2">
						<graphic xlink:href="2346-2183-dyna-90-226-36-e2.png"/>
					</disp-formula>
				</p>
				<p>Mean Squared Error is the average of the squared differences between the true values and the predicted values, which greatly penalizes outliers. Because the differences are squared, the result is always non-negative regardless of the sign of the error, and 0.0 indicates a perfect match [<xref ref-type="bibr" rid="B17">17</xref>]. Mathematically it can be expressed as (3).</p>
				<p>
					<disp-formula id="e3">
						<graphic xlink:href="2346-2183-dyna-90-226-36-e3.png"/>
					</disp-formula>
				</p>
				<p>Huber Loss is less sensitive to outliers: when the error is large it behaves like the absolute error, and it becomes quadratic as the error decreases [<xref ref-type="bibr" rid="B18">18</xref>]. It thus combines the mean squared error and the mean absolute error. Its mathematical formulation is given in (4).</p>
				<p>
					<disp-formula id="e4">
						<graphic xlink:href="2346-2183-dyna-90-226-36-e4.png"/>
					</disp-formula>
				</p>
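				<p>The behavior of the four loss functions can be illustrated numerically. The following minimal sketch assumes the standard textbook definitions, which may differ in detail from eqs. (1)-(4); the function names, the clipping constant <monospace>EPS</monospace>, the Huber threshold <monospace>delta</monospace> and the toy values are illustrative assumptions, not taken from the study.</p>
				<code language="python"><![CDATA[
```python
import numpy as np

EPS = 1e-7  # illustrative clipping constant to avoid log(0)


def binary_cross_entropy(y_true, y_pred):
    """Negative mean log-likelihood of the true binary labels."""
    p = np.clip(y_pred, EPS, 1.0 - EPS)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))


def poisson_loss(y_true, y_pred):
    """Poisson negative log-likelihood, dropping the term constant in y_pred."""
    return float(np.mean(y_pred - y_true * np.log(y_pred + EPS)))


def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences; 0.0 means a perfect fit."""
    return float(np.mean((y_true - y_pred) ** 2))


def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for |error| <= delta, linear beyond it."""
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2
    linear = delta * err - 0.5 * delta ** 2
    return float(np.mean(np.where(err <= delta, quadratic, linear)))


# Toy labels and predicted probabilities (hypothetical values, not study data).
y_true = np.array([1.0, 0.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.7, 0.4])

losses = {
    "bce": binary_cross_entropy(y_true, y_pred),
    "poisson": poisson_loss(y_true, y_pred),
    "mse": mean_squared_error(y_true, y_pred),
    "huber": huber_loss(y_true, y_pred),
}
for name, value in losses.items():
    print(f"{name}: {value:.4f}")
```
]]></code>
				<p>On probabilities in the interval [0, 1] every error is at most 1, so with <monospace>delta=1.0</monospace> the Huber loss in this sketch is simply half the mean squared error; the two only diverge for larger errors.</p>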
				<p>TensorFlow was used as the backend. The network had 11 input neurons, one for each of the 11 input features of the dataset, a variable number of hidden layers, and one neuron in the output layer for the binary output.</p>
				<p>A sequential model with fully connected layers was used, where the dimensions were defined according to the number of inputs, the output, and the variation of the hidden layers, aiming for a small and efficient model. The activation function used was the sigmoid. In addition, different loss functions were compared. The optimizer used throughout the experimental phase was Adam, and the metrics used to judge the performance of the neural network were the F1 score and the auROC obtained from the ROC curves. Moreover, because the data set is large, training was carried out in batches. The characteristics are summarized in <xref ref-type="table" rid="t3">Table 3</xref>.</p>
				<p>
					<table-wrap id="t3">
						<label>Table 3</label>
						<caption>
							<title>Features of the Model</title>
						</caption>
						<graphic xlink:href="2346-2183-dyna-90-226-36-gt3.png"/>
						<table-wrap-foot>
							<fn id="TFN3">
								<p>Source: The Authors</p>
							</fn>
						</table-wrap-foot>
					</table-wrap>
				</p>
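				<p>To make the dimensions of the network concrete, the following sketch reproduces in plain NumPy the forward pass of a fully connected model with 11 inputs, one hidden layer of 10 neurons and a single output. It assumes a sigmoid after every layer and uses random, untrained weights; the helper names <monospace>init_params</monospace> and <monospace>forward</monospace> are hypothetical, and the sketch is not the Keras model used in the study.</p>
				<code language="python"><![CDATA[
```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(layer_sizes, seed=0):
    """Random (untrained) weights and zero biases for each dense layer."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(params, x):
    """Fully connected forward pass with a sigmoid after every layer."""
    a = x
    for weights, bias in params:
        a = sigmoid(a @ weights + bias)
    return a


# 11 binary inputs -> 10 hidden neurons -> 1 output probability.
params = init_params([11, 10, 1])
n_params = sum(w.size + b.size for w, b in params)

# Hypothetical batch of 4 records with 11 binary features each.
batch = np.random.default_rng(1).integers(0, 2, size=(4, 11)).astype(float)
probs = forward(params, batch)
print(n_params, probs.shape)
```
]]></code>
				<p>With this layout the model has only 131 trainable parameters (11 x 10 weights plus 10 biases in the hidden layer, and 10 weights plus 1 bias in the output layer), which is consistent with the stated aim of a small model.</p>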
			</sec>
		</sec>
		<sec sec-type="results">
			<title>3. Results</title>
			<p>The model was trained with data from 5,434 individuals, 19.34% negative and 80.66% positive for COVID-19. It was validated with 5-fold cross-validation to ensure that the results are independent of the partition between training and test data (<xref ref-type="table" rid="t4">Tables 4</xref>-<xref ref-type="table" rid="t6">6</xref>).</p>
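			<p>The partitioning behind a 5-fold cross-validation can be sketched as follows. This is a minimal illustration that assumes a shuffled, non-stratified split with a fixed seed; the study does not state which splitting strategy was used, and the function name <monospace>k_fold_indices</monospace> is hypothetical.</p>
			<code language="python"><![CDATA[
```python
import numpy as np


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx


N_RECORDS = 5434  # size of the data set described in Section 2.1
splits = list(k_fold_indices(N_RECORDS, k=5))
for fold, (train_idx, test_idx) in enumerate(splits, start=1):
    share = len(test_idx) / N_RECORDS
    print(f"fold {fold}: train={len(train_idx)} test={len(test_idx)} ({share:.1%})")
```
]]></code>
			<p>Each record is held out exactly once, and every held-out fold contains about 20% of the data, which matches the 80%/20% division between training and test data described in Section 2.2.</p>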
			<p>
				<table-wrap id="t4">
					<label>Table 4</label>
					<caption>
						<title>Comparative results of the loss functions used in the study with a 5-fold cross-validation, in a model with a hidden layer of 10 neurons</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gt4.jpg"/>
					<table-wrap-foot>
						<fn id="TFN4">
							<p>Source: The Authors</p>
						</fn>
					</table-wrap-foot>
				</table-wrap>
			</p>
			<p>
				<table-wrap id="t5">
					<label>Table 5</label>
					<caption>
						<title>Comparative results of the loss functions used in the study with a 5-fold cross-validation, in a model with a hidden layer of 5 neurons</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gt5.jpg"/>
					<table-wrap-foot>
						<fn id="TFN5">
							<p>Source: The Authors</p>
						</fn>
					</table-wrap-foot>
				</table-wrap>
			</p>
			<p>
				<table-wrap id="t6">
					<label>Table 6</label>
					<caption>
						<title>Comparative results of the loss functions used in the study with a 5-fold cross-validation, in a model with two hidden layers of 5 neurons each</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gt6.jpg"/>
					<table-wrap-foot>
						<fn id="TFN6">
							<p>Source: The Authors</p>
						</fn>
					</table-wrap-foot>
				</table-wrap>
			</p>
			<p>The F1 score of the model with one hidden layer of 10 neurons shows similar behavior across the four loss functions, with a mean of 0.97 and a standard deviation of +/-0.005. </p>
			<p>However, the best model was obtained using the Binary Cross Entropy loss function, achieving a result of 0.98.</p>
			<p>The model was scored on the test set using the auROC, which summarizes performance across different decision thresholds and therefore reflects the false-positive rate, the false-negative rate, and the overall accuracy. As seen in <xref ref-type="table" rid="t7">Tables 7</xref> to <xref ref-type="table" rid="t9">9</xref>, the best performance was obtained using the Binary Cross Entropy loss function with a single hidden layer of 10 neurons.</p>
			<p>Although the behavior of the models was similar in terms of the F1 score, the auROC shows differences of 7 percentage points, as shown in <xref ref-type="table" rid="t7">Tables 7</xref> to <xref ref-type="table" rid="t9">9</xref>. The best result for this metric was obtained with the Binary Cross Entropy loss function, with a value of 0.99 regardless of the number of layers and neurons (<xref ref-type="fig" rid="f2">Fig. 2</xref>).</p>
			<p>
				<table-wrap id="t7">
					<label>Table 7</label>
					<caption>
						<title>Comparative results of the auROC of 5-fold cross-validation, in a model with a hidden layer of 10 neurons</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gt7.jpg"/>
					<table-wrap-foot>
						<fn id="TFN7">
							<p>Source: The Authors</p>
						</fn>
					</table-wrap-foot>
				</table-wrap>
			</p>
			<p>The metrics of all the ROC curves in this study were calculated using the sklearn.metrics module.</p>
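			<p>For reference, the two performance indicators can be written out explicitly. The sketch below uses plain NumPy so that the definitions are visible; it is an illustration on hypothetical labels and scores, not the code used in the study, and in practice library routines such as <monospace>roc_auc_score</monospace> in sklearn.metrics would normally be used instead.</p>
			<code language="python"><![CDATA[
```python
import numpy as np


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels and scores, not taken from the study.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [int(s >= 0.5) for s in scores]  # threshold of 0.5 is an assumption

print(roc_auc(y_true, scores), f1_score(y_true, y_pred))
```
]]></code>
			<p>Note that the auROC is computed from the continuous scores and does not depend on a decision threshold, whereas the F1 score is computed from thresholded predictions; this is one reason why the two metrics can rank models differently.</p>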
			<p>The SHapley Additive exPlanations (SHAP) method introduced by Lundberg and Lee [<xref ref-type="bibr" rid="B19">19</xref>], whose origins are in game theory, was used to identify the features most relevant to the model. Since it estimates the differences between models with subsets of the feature space, it allows the predictions of machine learning models to be interpreted using SHAP values, which estimate the contribution of each feature to the model prediction. The SHAP method for interpreting a model uses additive feature attribution, where the features are the input variables. Thus, it represents the classification result as the sum of the contributions of each feature, as in (5).</p>
			<p>
				<disp-formula id="e5">
					<graphic xlink:href="2346-2183-dyna-90-226-36-e5.png"/>
				</disp-formula>
			</p>
			<p>where <italic>g</italic> is the explanation model, <italic>z′</italic> is the simplified feature vector, <italic>M</italic> is the maximum size of the simplified feature set and <italic>ϕ<sub>j</sub> ∈ ℝ</italic> is the attribution for feature <italic>j</italic> [<xref ref-type="bibr" rid="B20">20</xref>,<xref ref-type="bibr" rid="B21">21</xref>].</p>
			<p>The most important features considered by the model are summarized in the SHAP graph in <xref ref-type="fig" rid="f3">Fig. 3</xref>. The presence of breathing problem, cough, fever, and sore throat were key predictors of the presence of the disease. The features that showed a low impact in almost all cases were runny nose, asthma, diabetes, chronic lung disease, hypertension, and headache.</p>
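			<p>The additive property can be checked on a small example. The sketch below computes exact Shapley values by enumerating all feature subsets for a hypothetical three-feature logistic scorer, measured against a single all-zero baseline input. It assumes the classical Shapley definition and is not the estimator implemented in the SHAP library or the model of this study; the weights, the baseline and the function names are made up for illustration.</p>
			<code language="python"><![CDATA[
```python
import math
from itertools import combinations


def toy_model(z):
    """Hypothetical 3-feature logistic scorer (weights are made up)."""
    weights = (2.0, 1.0, -0.5)
    bias = -1.0
    s = bias + sum(w * v for w, v in zip(weights, z))
    return 1.0 / (1.0 + math.exp(-s))


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to a single baseline input."""
    m = len(x)

    def value(subset):
        # Features in `subset` take their observed value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(m)]
        return model(z)

    phi = []
    for j in range(m):
        others = [i for i in range(m) if i != j]
        total = 0.0
        for size in range(m):
            weight = math.factorial(size) * math.factorial(m - size - 1) / math.factorial(m)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return value(set()), phi


x = [1, 1, 0]          # hypothetical patient: first two features present
baseline = [0, 0, 0]   # reference input: no feature present
phi0, phi = shapley_values(toy_model, x, baseline)
reconstructed = phi0 + sum(phi)
print(phi0, phi, reconstructed, toy_model(x))
```
]]></code>
			<p>The base value plus the sum of the attributions reproduces the model output for the explained input, which is the additive form of (5). A feature whose value equals the baseline, such as the third one here, receives an attribution of zero.</p>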
			<p>
				<table-wrap id="t8">
					<label>Table 8</label>
					<caption>
						<title>Comparative results of the auROC of 5-fold cross-validation, in a model with a hidden layer of 5 neurons</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gt8.jpg"/>
					<table-wrap-foot>
						<fn id="TFN8">
							<p>Source: The Authors</p>
						</fn>
					</table-wrap-foot>
				</table-wrap>
			</p>
			<p>
				<table-wrap id="t9">
					<label>Table 9</label>
					<caption>
						<title>Comparative results of the auROC of 5-fold cross-validation, in a model with two hidden layers of 5 neurons each</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gt9.jpg"/>
					<table-wrap-foot>
						<fn id="TFN9">
							<p>Source: The Authors</p>
						</fn>
					</table-wrap-foot>
				</table-wrap>
			</p>
			<p>
				<fig id="f2">
					<label>Figure 2</label>
					<caption>
						<title>ROC curves showing the performance of the model using the binary cross-entropy loss function, on the test sets in the 5-fold cross-validation.</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gf2.png"/>
					<attrib>Source: The Authors</attrib>
				</fig>
			</p>
			<p>
				<fig id="f3">
					<label>Figure 3</label>
					<caption>
						<title>SHapley Additive exPlanations (SHAP) graph showing the important features considered by the ANN model to predict the diagnosis of COVID-19, in the 5-fold cross-validation.</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gf3.png"/>
					<attrib>Source: The Authors</attrib>
				</fig>
			</p>
			<p>
				<table-wrap id="t10">
					<label>Table 10</label>
					<caption>
						<title>Less relevant features considered by the ANN model to predict the diagnosis of COVID-19, in the 5-fold cross-validation</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gt10.png"/>
					<table-wrap-foot>
						<fn id="TFN10">
							<p>Source: The Authors</p>
						</fn>
					</table-wrap-foot>
				</table-wrap>
			</p>
			<p>Considering the results of the SHAP values in the 5-fold cross-validation, the least relevant features in each fold are shown in <xref ref-type="table" rid="t10">Table 10</xref>. The least relevant features were runny nose and asthma, which were common to 4 of the 5 folds.</p>
			<p>The model was retrained with the best configuration, but with the input neurons corresponding to runny nose and asthma removed. With these least relevant features removed, an F1 score of 0.97 +/-0.12 and an auROC of 0.98 were obtained, a variation of about 1%. This is a negligible variation considering the elimination of features from the model.</p>
			<p>Finally, the model was retrained after also eliminating the next three least relevant features (diabetes, chronic lung disease and hypertension), leaving a total of 6 inputs. The F1 score was 0.96 +/-0.34% and the auROC was 0.96, a difference of 2 and 3%, respectively, for each performance index.</p>
			<p>
				<fig id="f4">
					<label>Figure 4</label>
					<caption>
						<title>ROC curve showing the performance of the model using six features</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gf4.png"/>
					<attrib>Source: The Authors</attrib>
				</fig>
			</p>
			<p>
				<fig id="f5">
					<label>Figure 5</label>
					<caption>
						<title>SHapley Additive exPlanations (SHAP) graph showing the important features considered by the new ANN model with less features</title>
					</caption>
					<graphic xlink:href="2346-2183-dyna-90-226-36-gf5.png"/>
					<attrib>Source: The Authors</attrib>
				</fig>
			</p>
			<p>Compared with the F1 scores in <xref ref-type="table" rid="t4">Tables 4</xref> to <xref ref-type="table" rid="t6">6</xref>, a decrease of approximately 1% is obtained, but considering the ROC curve, the result obtained with the new model that considers fewer features is still better by approximately 5%, as shown in <xref ref-type="table" rid="t7">Tables 7</xref> to <xref ref-type="table" rid="t9">9</xref>. <xref ref-type="fig" rid="f4">Fig. 4</xref> shows the ROC curve of the model using fewer features. In addition, <xref ref-type="fig" rid="f5">Fig. 5</xref> shows the SHAP values of the new model: the two most important features do not change in order or magnitude, and in general the importance of the features was maintained even when some were removed from the initial model.</p>
		</sec>
		<sec sec-type="discussion">
			<title>4. Discussion</title>
			<p>Health monitoring using artificial intelligence techniques is a very active field. During the current pandemic in particular, a wide range of approaches have been used for monitoring and evaluating patients with COVID-19. Promising solutions have been proposed for screening using clinical symptoms as a preliminary step. This study shows the use of the preliminary symptomatology and clinical condition of the patient to detect possible COVID-19 using a machine learning algorithm and information from 5,434 people with and without the disease. </p>
			<p>It is shown that feature reduction using techniques such as SHAP values can produce simpler models that use only the relevant features to solve a problem. In the case presented, comparing the model with 11 features, with an F1 score of 0.98 +/-0.08% and an auROC of 0.99, against the model with 6 features, with an F1 score of 0.96 +/-0.34% and an auROC of 0.96, yields a difference of 2 and 3%, respectively, for each performance index. A very small change in performance is thus obtained with a reduction from 11 to 6 features (about 45%).</p>
			<p>The authors, based on the source of the data, consider that the study is not free of errors and biases, since the clinical conditions (asthma, chronic lung disease, heart disease, diabetes and hypertension) and the six initial clinical symptoms (breathing problem, fever, dry cough, sore throat, runny nose and headache) of the 5,434 people were taken from public data on the Kaggle platform intended for experimentation with machine learning models. Nevertheless, the data are useful as a basis to evaluate the performance of the machine learning model using different configuration parameters and to learn about the most relevant features considered by the model, showing a path for future studies using proprietary databases acquired for research purposes.</p>
		</sec>
		<sec sec-type="conclusions">
			<title>5. Conclusion and future research</title>
			<p>In this study, we used data from public sources as an experimental stage. We evaluated different loss functions and configuration parameters of an ANN to obtain an optimal model that can detect the disease and to identify the most relevant features influencing the detection.</p>
			<p>From the data used, a model was developed to predict the diagnosis of COVID-19 with an F1 score of 0.98 and an aucROC of 0.99, using eleven basic features. Using the SHAP values, a final model with only 6 features was obtained, achieving an F1 score of 0.96 and an aucROC of 0.96; the percentage difference is very small and the features relevant to the model are retained.</p>
			<p>The model is intended to benefit the response of health systems to this disease and other respiratory viruses, although the need for more robust data to complement the study and avoid possible biases is emphasized before the algorithm is employed.</p>
		</sec>
	</body>
	<back>
		<ack>
			<title>Acknowledgment</title>
			<p>The authors would like to thank the Fundación Universitaria de San Gil - UNISANGIL, Colombia, and the Centro de Investigación en Ciencia Aplicada y Tecnología Avanzada, Querétaro unit, of the Instituto Politécnico Nacional, México, for their support of this work.</p>
		</ack>
		<ref-list>
			<title>References</title>
			<ref id="B1">
				<label>[1]</label>
				<mixed-citation>[1] Peña-Reyes, C. A. and Sipper, M., Evolutionary Computation in medicine: an overview, Artif. Intell. Med., 19(1), pp. 1-23, 2000, DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/S0933-3657(99)00047-0">https://doi.org/10.1016/S0933-3657(99)00047-0</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Peña-Reyes</surname>
							<given-names>C. A.</given-names>
						</name>
						<name>
							<surname>Sipper</surname>
							<given-names>M.</given-names>
						</name>
					</person-group>
					<article-title>Evolutionary Computation in medicine: an overview</article-title>
					<source>Artif. Intell. Med</source>
					<volume>19</volume>
					<issue>1</issue>
					<fpage>1</fpage>
					<lpage>23</lpage>
					<year>2000</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/S0933-3657(99)00047-0">https://doi.org/10.1016/S0933-3657(99)00047-0</ext-link>
				</element-citation>
			</ref>
			<ref id="B2">
				<label>[2]</label>
				<mixed-citation>[2] Tan, K.C., Yu, Q.C., Heng, M., and Lee, T.H., Evolutionary computing for knowledge discovery in medical diagnosis, Artif. Intell. Med., 27(2), pp. 129-154, 2003, DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/S0933-3657(03)00002-2">https://doi.org/10.1016/S0933-3657(03)00002-2</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Tan</surname>
							<given-names>K.C.</given-names>
						</name>
						<name>
							<surname>Yu</surname>
							<given-names>Q.C.</given-names>
						</name>
						<name>
							<surname>Heng</surname>
							<given-names>M.</given-names>
						</name>
						<name>
							<surname>Lee</surname>
							<given-names>T.H.</given-names>
						</name>
					</person-group>
					<article-title>Evolutionary computing for knowledge discovery in medical diagnosis</article-title>
					<source>Artif. Intell. Med</source>
					<volume>27</volume>
					<issue>2</issue>
					<fpage>129</fpage>
					<lpage>154</lpage>
					<year>2003</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/S0933-3657(03)00002-2">https://doi.org/10.1016/S0933-3657(03)00002-2</ext-link>
				</element-citation>
			</ref>
			<ref id="B3">
				<label>[3]</label>
				<mixed-citation>[3] Li, Z., Chen, W., Wang, J. and Liu, J., An automatic recognition system for patients with movement disorders based on wearable sensors, in: Proc. 9th IEEE Conf. Ind. Electron. Appl. ICIEA 2014, pp. 1948-1953, 2014. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICIEA.2014.6931487">https://doi.org/10.1109/ICIEA.2014.6931487</ext-link>.</mixed-citation>
				<element-citation publication-type="confproc">
					<person-group person-group-type="author">
						<name>
							<surname>Li</surname>
							<given-names>Z.</given-names>
						</name>
						<name>
							<surname>Chen</surname>
							<given-names>W.</given-names>
						</name>
						<name>
							<surname>Wang</surname>
							<given-names>J.</given-names>
						</name>
						<name>
							<surname>Liu</surname>
							<given-names>J.</given-names>
						</name>
					</person-group>
					<source>An automatic recognition system for patients with movement disorders based on wearable sensors</source>
					<conf-name>9th IEEE Conf. Ind. Electron. Appl. (ICIEA 2014)</conf-name>
					<conf-date>2014</conf-date>
					<fpage>1948</fpage>
					<lpage>1953</lpage>
					<year>2014</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICIEA.2014.6931487">https://doi.org/10.1109/ICIEA.2014.6931487</ext-link>
				</element-citation>
			</ref>
			<ref id="B4">
				<label>[4]</label>
				<mixed-citation>[4] Andrikopoulou, M. et al., Symptoms and critical illness among obstetric patients with coronavirus disease 2019 (COVID-19) infection, Obstet. Gynecol., 136(2), pp. 291-299, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1097/AOG.0000000000003996">https://doi.org/10.1097/AOG.0000000000003996</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Andrikopoulou</surname>
							<given-names>M.</given-names>
						</name>
						<etal/>
					</person-group>
					<article-title>Symptoms and critical illness among obstetric patients with coronavirus disease 2019 (COVID-19) infection</article-title>
					<source>Obstet. Gynecol</source>
					<volume>136</volume>
					<issue>2</issue>
					<fpage>291</fpage>
					<lpage>299</lpage>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1097/AOG.0000000000003996">https://doi.org/10.1097/AOG.0000000000003996</ext-link>
				</element-citation>
			</ref>
			<ref id="B5">
				<label>[5]</label>
				<mixed-citation>[5] Amenta, E.M., Spallone, A., Rodriguez-Barradas, M.C., El-Sahly, H.M., Atmar, R.L., and Kulkarni, P.A., Postacute COVID-19: an overview and approach to classification, Open Forum Infect. Dis., 7(12), pp. 1-7, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1093/ofid/ofaa509">https://doi.org/10.1093/ofid/ofaa509</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Amenta</surname>
							<given-names>E.M.</given-names>
						</name>
						<name>
							<surname>Spallone</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Rodriguez-Barradas</surname>
							<given-names>M.C.</given-names>
						</name>
						<name>
							<surname>El-Sahly</surname>
							<given-names>H.M.</given-names>
						</name>
						<name>
							<surname>Atmar</surname>
							<given-names>R.L.</given-names>
						</name>
						<name>
							<surname>Kulkarni</surname>
							<given-names>P.A.</given-names>
						</name>
					</person-group>
					<article-title>Postacute COVID-19: an overview and approach to classification</article-title>
					<source>Open Forum Infect. Dis.</source>
					<volume>7</volume>
					<issue>12</issue>
					<fpage>1</fpage>
					<lpage>7</lpage>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1093/ofid/ofaa509">https://doi.org/10.1093/ofid/ofaa509</ext-link>
				</element-citation>
			</ref>
			<ref id="B6">
				<label>[6]</label>
				<mixed-citation>[6] Maghdid, H.S., Ghafoor, K.Z., Sadiq, A.S., Curran, K., Rawat, D.B., and Rabie, K., A novel AI-enabled framework to diagnose coronavirus COVID-19 using smartphone embedded sensors: design study, arXiv, pp. 1-7, 2020, DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.2003.07434">https://doi.org/10.48550/arXiv.2003.07434</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Maghdid</surname>
							<given-names>H.S.</given-names>
						</name>
						<name>
							<surname>Ghafoor</surname>
							<given-names>K.Z.</given-names>
						</name>
						<name>
							<surname>Sadiq</surname>
							<given-names>A.S.</given-names>
						</name>
						<name>
							<surname>Curran</surname>
							<given-names>K.</given-names>
						</name>
						<name>
							<surname>Rawat</surname>
							<given-names>D.B.</given-names>
						</name>
						<name>
							<surname>Rabie</surname>
							<given-names>K.</given-names>
						</name>
					</person-group>
					<article-title>A novel AI-enabled framework to diagnose coronavirus COVID-19 using smartphone embedded sensors: design study</article-title>
					<source>arXiv</source>
					<fpage>1</fpage>
					<lpage>7</lpage>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.2003.07434">https://doi.org/10.48550/arXiv.2003.07434</ext-link>
				</element-citation>
			</ref>
			<ref id="B7">
				<label>[7]</label>
				<mixed-citation>[7] Alimadadi, A., Aryal, S., Manandhar, I., Munroe, P.B., Joe, B., and Cheng, X., Artificial intelligence and machine learning to fight Covid-19, Physiol. Genomics, 52(4), pp. 200-202, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1152/physiolgenomics.00029.2020">https://doi.org/10.1152/physiolgenomics.00029.2020</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Alimadadi</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Aryal</surname>
							<given-names>S.</given-names>
						</name>
						<name>
							<surname>Manandhar</surname>
							<given-names>I.</given-names>
						</name>
						<name>
							<surname>Munroe</surname>
							<given-names>P.B.</given-names>
						</name>
						<name>
							<surname>Joe</surname>
							<given-names>B.</given-names>
						</name>
						<name>
							<surname>Cheng</surname>
							<given-names>X.</given-names>
						</name>
					</person-group>
					<article-title>Artificial intelligence and machine learning to fight Covid-19</article-title>
					<source>Physiol. Genomics</source>
					<volume>52</volume>
					<issue>4</issue>
					<fpage>200</fpage>
					<lpage>202</lpage>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1152/physiolgenomics.00029.2020">https://doi.org/10.1152/physiolgenomics.00029.2020</ext-link>
				</element-citation>
			</ref>
			<ref id="B8">
				<label>[8]</label>
				<mixed-citation>[8] Zoabi, Y., and Shomron, N., COVID-19 diagnosis prediction by symptoms of tested individuals: a machine learning approach, NPJ Digital Medicine, May, art. 93948, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1101/2020.05.07.20093948">https://doi.org/10.1101/2020.05.07.20093948</ext-link>.</mixed-citation>
				<element-citation publication-type="book">
					<person-group person-group-type="author">
						<name>
							<surname>Zoabi</surname>
							<given-names>Y.</given-names>
						</name>
						<name>
							<surname>Shomron</surname>
							<given-names>N.</given-names>
						</name>
					</person-group>
					<source>COVID-19 diagnosis prediction by symptoms of tested individuals: a machine learning approach</source>
					<publisher-name>NPJ Digital Medicine</publisher-name>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1101/2020.05.07.20093948">https://doi.org/10.1101/2020.05.07.20093948</ext-link>
				</element-citation>
			</ref>
			<ref id="B9">
				<label>[9]</label>
				<mixed-citation>[9] Alafif, T. and Bajaba, S., Machine and deep learning towards COVID-19 diagnosis and treatment: survey, Challenges, November, art. 47848, 2020, DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.13140/RG.2.2.20805.47848/1">https://doi.org/10.13140/RG.2.2.20805.47848/1</ext-link>.</mixed-citation>
				<element-citation publication-type="book">
					<person-group person-group-type="author">
						<name>
							<surname>Alafif</surname>
							<given-names>T.</given-names>
						</name>
						<name>
							<surname>Bajaba</surname>
							<given-names>S.</given-names>
						</name>
					</person-group>
					<source>Machine and deep learning towards COVID-19 diagnosis and treatment: survey</source>
					<publisher-name>Challenges</publisher-name>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.13140/RG.2.2.20805.47848/1">https://doi.org/10.13140/RG.2.2.20805.47848/1</ext-link>
				</element-citation>
			</ref>
			<ref id="B10">
				<label>[10]</label>
				<mixed-citation>[10] Zoabi, Y., Deri-Rozov, S. and Shomron, N., Machine learning-based prediction of COVID-19 diagnosis based on symptoms, npj Digit. Med., 4(1), 2021. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1038/s41746-020-00372-6">https://doi.org/10.1038/s41746-020-00372-6</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Zoabi</surname>
							<given-names>Y.</given-names>
						</name>
						<name>
							<surname>Deri-Rozov</surname>
							<given-names>S.</given-names>
						</name>
						<name>
							<surname>Shomron</surname>
							<given-names>N.</given-names>
						</name>
					</person-group>
					<article-title>Machine learning-based prediction of COVID-19 diagnosis based on symptoms</article-title>
					<source>npj Digit. Med</source>
					<volume>4</volume>
					<issue>1</issue>
					<year>2021</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1038/s41746-020-00372-6">https://doi.org/10.1038/s41746-020-00372-6</ext-link>
				</element-citation>
			</ref>
			<ref id="B11">
				<label>[11]</label>
				<mixed-citation>[11] Chen, Y. et al., An interpretable machine learning framework for accurate severe vs non-severe COVID-19 clinical type classification, medRxiv, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1101/2020.05.18.20105841">https://doi.org/10.1101/2020.05.18.20105841</ext-link>.</mixed-citation>
				<element-citation publication-type="book">
					<person-group person-group-type="author">
						<name>
							<surname>Chen</surname>
							<given-names>Y.</given-names>
						</name>
						<etal/>
					</person-group>
					<source>An interpretable machine learning framework for accurate severe vs non-severe COVID-19 clinical type classification</source>
					<publisher-name>medRxiv</publisher-name>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1101/2020.05.18.20105841">https://doi.org/10.1101/2020.05.18.20105841</ext-link>
				</element-citation>
			</ref>
			<ref id="B12">
				<label>[12]</label>
				<mixed-citation>[12] Ahamad, M.M. et al., A machine learning model to identify early stage symptoms of SARS-Cov-2 infected patients, Expert Syst. Appl., 160, art. 113661, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.eswa.2020.113661">https://doi.org/10.1016/j.eswa.2020.113661</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Ahamad</surname>
							<given-names>M.M.</given-names>
						</name>
						<etal/>
					</person-group>
					<article-title>A machine learning model to identify early stage symptoms of SARS-Cov-2 infected patients</article-title>
					<source>Expert Syst. Appl.</source>
					<volume>160</volume>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.eswa.2020.113661">https://doi.org/10.1016/j.eswa.2020.113661</ext-link>
				</element-citation>
			</ref>
			<ref id="B13">
				<label>[13]</label>
				<mixed-citation>[13] Khanday, A.M.U.D., Rabani, S.T., Khan, Q.R., Rouf, N., and Mohi Ud Din, M., Machine learning based approaches for detecting COVID-19 using clinical text data, Int. J. Inf. Technol., 12(3), pp. 731-739, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s41870-020-00495-9">https://doi.org/10.1007/s41870-020-00495-9</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Khanday</surname>
							<given-names>A.M.U.D.</given-names>
						</name>
						<name>
							<surname>Rabani</surname>
							<given-names>S.T.</given-names>
						</name>
						<name>
							<surname>Khan</surname>
							<given-names>Q.R.</given-names>
						</name>
						<name>
							<surname>Rouf</surname>
							<given-names>N.</given-names>
						</name>
						<name>
							<surname>Mohi Ud Din</surname>
							<given-names>M.</given-names>
						</name>
					</person-group>
					<article-title>Machine learning based approaches for detecting COVID-19 using clinical text data</article-title>
					<source>Int. J. Inf. Technol</source>
					<volume>12</volume>
					<issue>3</issue>
					<fpage>731</fpage>
					<lpage>739</lpage>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s41870-020-00495-9">https://doi.org/10.1007/s41870-020-00495-9</ext-link>
				</element-citation>
			</ref>
			<ref id="B14">
				<label>[14]</label>
				<mixed-citation>[14] Smarr, B.L. et al., Feasibility of continuous fever monitoring using wearable devices, Sci. Rep., 10(1), art. 21640, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1038/s41598-020-78355-6">https://doi.org/10.1038/s41598-020-78355-6</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Smarr</surname>
							<given-names>B.L.</given-names>
						</name>
						<etal/>
					</person-group>
					<article-title>Feasibility of continuous fever monitoring using wearable devices</article-title>
					<source>Sci. Rep.</source>
					<volume>10</volume>
					<issue>1</issue>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1038/s41598-020-78355-6">https://doi.org/10.1038/s41598-020-78355-6</ext-link>
				</element-citation>
			</ref>
			<ref id="B15">
				<label>[15]</label>
				<mixed-citation>[15] Usha-Ruby, A., Theerthagiri, P., Jeena-Jacob, I., and Vamsidhar, Y., Binary cross entropy with deep learning technique for image classification, Int. J. Adv. Trends Comput. Sci. Eng., 9(4), pp. 5393-5397, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.30534/ijatcse/2020/175942020">https://doi.org/10.30534/ijatcse/2020/175942020</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Usha-Ruby</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Theerthagiri</surname>
							<given-names>P.</given-names>
						</name>
						<name>
							<surname>Jeena-Jacob</surname>
							<given-names>I.</given-names>
						</name>
						<name>
							<surname>Vamsidhar</surname>
							<given-names>Y.</given-names>
						</name>
					</person-group>
					<article-title>Binary cross entropy with deep learning technique for image classification</article-title>
					<source>Int. J. Adv. Trends Comput. Sci. Eng</source>
					<volume>9</volume>
					<issue>4</issue>
					<fpage>5393</fpage>
					<lpage>5397</lpage>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.30534/ijatcse/2020/175942020">https://doi.org/10.30534/ijatcse/2020/175942020</ext-link>
				</element-citation>
			</ref>
			<ref id="B16">
				<label>[16]</label>
				<mixed-citation>[16] Valencia, A.M., Construcción de la distribución de pérdidas y el problema de agregación de riesgo operativo bajo modelos LDA: una revisión, Revista Ingenierías Universidad de Medellín, 12(23), pp. 71-82, 2013.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Valencia</surname>
							<given-names>A.M.</given-names>
						</name>
					</person-group>
					<article-title>Construcción de la distribución de pérdidas y el problema de agregación de riesgo operativo bajo modelos LDA: una revisión</article-title>
					<source>Revista Ingenierías Universidad de Medellín</source>
					<volume>12</volume>
					<issue>23</issue>
					<fpage>71</fpage>
					<lpage>82</lpage>
					<year>2013</year>
				</element-citation>
			</ref>
			<ref id="B17">
				<label>[17]</label>
				<mixed-citation>[17] Wang, Z. and Bovik, A.C., Mean squared error: Love it or leave it? A new look at signal fidelity measures, IEEE Signal Process. Mag., 26(1), pp. 98-117, 2009, DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/MSP.2008.930649">https://doi.org/10.1109/MSP.2008.930649</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Wang</surname>
							<given-names>Z.</given-names>
						</name>
						<name>
							<surname>Bovik</surname>
							<given-names>A.C.</given-names>
						</name>
					</person-group>
					<article-title>Mean squared error: Love it or leave it? A new look at signal fidelity measures</article-title>
					<source>IEEE Signal Process. Mag</source>
					<volume>26</volume>
					<issue>1</issue>
					<fpage>98</fpage>
					<lpage>117</lpage>
					<year>2009</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/MSP.2008.930649">https://doi.org/10.1109/MSP.2008.930649</ext-link>
				</element-citation>
			</ref>
			<ref id="B18">
				<label>[18]</label>
				<mixed-citation>[18] Meyer, G.P., An alternative probabilistic interpretation of the Huber loss, arXiv:1911.02088v3, Section 2, pp. 5261-5269, 2019, DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.1911.02088">https://doi.org/10.48550/arXiv.1911.02088</ext-link>.</mixed-citation>
				<element-citation publication-type="book">
					<person-group person-group-type="author">
						<name>
							<surname>Meyer</surname>
							<given-names>G.P.</given-names>
						</name>
					</person-group>
					<source>An alternative probabilistic interpretation of the Huber loss</source>
					<publisher-name>arXiv</publisher-name>
					<fpage>5261</fpage>
					<lpage>5269</lpage>
					<year>2019</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.48550/arXiv.1911.02088">https://doi.org/10.48550/arXiv.1911.02088</ext-link>
				</element-citation>
			</ref>
			<ref id="B19">
				<label>[19]</label>
				<mixed-citation>[19] Lundberg, S. and Lee, S.-I., A unified approach to interpreting model predictions, Adv. Neural Inf. Process. Syst., 2017, pp. 4766-4775, 2017.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Lundberg</surname>
							<given-names>S.</given-names>
						</name>
						<name>
							<surname>Lee</surname>
							<given-names>S.-I.</given-names>
						</name>
					</person-group>
					<article-title>A unified approach to interpreting model predictions</article-title>
					<source>Adv. Neural Inf. Process. Syst</source>
					<volume>2017</volume>
					<fpage>4766</fpage>
					<lpage>4775</lpage>
					<year>2017</year>
				</element-citation>
			</ref>
			<ref id="B20">
				<label>[20]</label>
				<mixed-citation>[20] Mangalathu, S., Hwang, S.H. and Jeon, J.S., Failure mode and effects analysis of RC members based on machine-learning-based SHapley Additive exPlanations (SHAP) approach, Eng. Struct., 219, art. 110927, 2020. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.engstruct.2020.110927">https://doi.org/10.1016/j.engstruct.2020.110927</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Mangalathu</surname>
							<given-names>S.</given-names>
						</name>
						<name>
							<surname>Hwang</surname>
							<given-names>S.H.</given-names>
						</name>
						<name>
							<surname>Jeon</surname>
							<given-names>J.S.</given-names>
						</name>
					</person-group>
					<article-title>Failure mode and effects analysis of RC members based on machine-learning-based SHapley Additive exPlanations (SHAP) approach</article-title>
					<source>Eng. Struct</source>
					<volume>219</volume>
					<fpage>110927</fpage>
					<lpage>110927</lpage>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.engstruct.2020.110927">https://doi.org/10.1016/j.engstruct.2020.110927</ext-link>
				</element-citation>
			</ref>
			<ref id="B21">
				<label>[21]</label>
				<mixed-citation>[21] Štrumbelj, E. and Kononenko, I., Explaining prediction models and individual predictions with feature contributions, Knowl. Inf. Syst., 41(3), pp. 647-665, 2014. DOI: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/S10115-013-0679-X">https://doi.org/10.1007/S10115-013-0679-X</ext-link>.</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Štrumbelj</surname>
							<given-names>E.</given-names>
						</name>
						<name>
							<surname>Kononenko</surname>
							<given-names>I.</given-names>
						</name>
					</person-group>
					<article-title>Explaining prediction models and individual predictions with feature contributions</article-title>
					<source>Knowl. Inf. Syst.</source>
					<volume>41</volume>
					<issue>3</issue>
					<fpage>647</fpage>
					<lpage>665</lpage>
					<year>2014</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/S10115-013-0679-X">https://doi.org/10.1007/S10115-013-0679-X</ext-link>
				</element-citation>
			</ref>
		</ref-list>
		<fn-group>
			<fn fn-type="other" id="fn1">
				<label>How to cite:</label>
				<p> Ramirez-Bautista, J.A., Chaparro-Cárdenas, S.L., Gamboa-Contreras, W., Guerrero-Salazar, W. and Huerta-Ruelas, J.A., Classification of COVID-19 associated symptomatology using machine learning. DYNA, 90(226), pp. 36-43, April - June, 2023.</p>
			</fn>
		</fn-group>
		<fn-group>
			<fn fn-type="other" id="fn2">
				<label>J.A. Ramirez-Bautista,</label>
				<p> received the BSc. Eng. in Electronic Engineering from the Fundación Universitaria de San Gil (UNISANGIL), San Gil, Colombia, in 2013, and the MSc. in advanced technology and the PhD. in advanced technology, with a specialty in mechatronics, from the Research Center for Applied Science and Advanced Technology (CICATA), Instituto Politécnico Nacional, Queretaro, Mexico, in 2016 and 2020, respectively. He works on the design and programming of clinical decision support systems using deep neural networks and fuzzy systems. His research interests include fuzzy systems, hybrid systems, interface development, neural networks, and clinical decision support systems. He is currently a full-time professor at the Faculty of Natural Sciences and Engineering of UNISANGIL, Colombia. ORCID: 0000-0002-6472-5751</p>
			</fn>
			<fn fn-type="other" id="fn3">
				<label>S.L. Chaparro-Cárdenas,</label>
				<p> received the BSc. Eng. in Electronic Engineering from the Fundación Universitaria de San Gil (UNISANGIL), San Gil, Colombia, in 2013, and the MSc. in advanced technology and the PhD. in advanced technology, with a specialty in mechatronics, from the Research Center for Applied Science and Advanced Technology (CICATA), Instituto Politécnico Nacional, Queretaro, Mexico, in 2016 and 2021, respectively. She was recognized nationally by the Colombian Association of Engineers (ACIEM), Santander node, for the best graduation project of 2013-2014. She is currently a professor and researcher at UNISANGIL, Colombia. Her research interests include fuzzy systems, hybrid systems, robotic rehabilitation devices, neural networks, intelligent control and electrophysiology. ORCID: 0000-0002-2589-259X</p>
			</fn>
			<fn fn-type="other" id="fn4">
				<label>W. Gamboa-Contreras,</label>
				<p> received the BSc. Eng. in Electronic Engineering from the Universidad Industrial de Santander - UIS, Colombia, in 2002, the Sp. in Senior Management from the Universidad Industrial de Santander - UIS, Colombia, in 2008, and the MSc. in Technology and Innovation Management from the Universidad de Santander - UDES, Colombia, in 2020. 18 years of experience as a university teacher and researcher in agro-industrial science and technology and bioengineering. Inventor of 5 granted patents and one under examination, with 2 software registrations, two registered trademarks and two pilot plants. National and international recognitions: National INNOVATE 2020 award from ECOPETROL, UNIRED National Engineering Award (ACOFI 2010), and National Innovation Award (Seguros la Equidad). ORCID: 0000-0001-5526-3156</p>
			</fn>
			<fn fn-type="other" id="fn5">
				<label>W. Guerrero-Salazar,</label>
				<p> is an Agricultural Engineer and Business Administrator from the University Foundation of San Gil - UNISANGIL, Colombia (1998-2003), with an Sp. in Environmental Chemistry from the Industrial University of Santander - UIS, Colombia (2015). He has 16 years of experience as a teacher and university researcher in science and technology, as well as experience in entrepreneurship. Inventor of 1 patent. National awards: ACOFI National Award 2009-2011. ORCID: 0000-0002-2441-5441</p>
			</fn>
			<fn fn-type="other" id="fn6">
				<label>J.A. Huerta-Ruelas,</label>
				<p> received the MSc. in solid state physics and the PhD. in electrical engineering from the Autonomous University of San Luis Potosi, San Luis Potosi, Mexico, in 1995 and 2000, respectively. He held a Postdoctoral Fellowship in the Department of Science and Food Technology, Oregon State University, Corvallis, OR, USA, in 2004. He is a professor with the Advanced Technology Graduate Program, teaching optical characterization techniques, interaction of radiation with matter, and the writing and publishing of technical and scientific documents. From 2010 to 2013, he was the director of the Centro de Investigación en Ciencia Aplicada y Tecnología Avanzada, Instituto Politécnico Nacional, Querétaro, México. He is currently a member of the National System of Researchers. His current research focuses on the development of optical measuring systems for use in research and industrial process control. ORCID: 0000-0001-5632-3368</p>
			</fn>
		</fn-group>
	</back>
</article>