DYNA

0012-7353

2346-2183

Universidad Nacional de Colombia

https://doi.org/10.15446/dyna.v89n224.103666

Received: July 13, 2022; Revision received: August 24, 2022; Accepted: September 9, 2022

Monitoring overdispersed process in clinical laboratories using control charts

Monitoreo de procesos sobredispersos en laboratorios clínicos usando cartas de control

J. Valdés-Manuel ^a

J. Cogollo-Flórez ^b

a Tecnológico de Estudios Superiores de Jocotitlán, Jocotitlán, México. 2017150481112@tesjo.edu.mx

b Instituto Tecnológico Metropolitano - ITM, Medellín, Colombia. juancogollo@itm.edu.co

Abstract

Overdispersion is a phenomenon that generally occurs in the analysis of large sample sizes. In discrete data analysis, it refers to the presence of variation higher than that implied by a reference binomial or Poisson distribution. The proportion of nonconforming units in clinical laboratories presents high variability and, generally, overdispersion. Therefore, it is necessary to identify the control charts that best overcome the limitations of traditional control charts in dealing with overdispersed data. This paper analyzes the monitoring of overdispersed processes in clinical laboratories using control charts. The methodology consists of four steps: (i) Determination of the interest variable, (ii) Diagnosis of data overdispersion, (iii) Elaboration of control charts, and (iv) Analysis of results. The results show that the methodology can quantitatively determine the degree of data overdispersion and select the most appropriate control chart for monitoring the process.

Keywords:

clinical process monitoring, control charts improvement, overdispersed data analysis, statistical engineering.

Resumen:

La sobredispersión es un fenómeno que se produce generalmente en el análisis de muestras de gran tamaño. Se refiere, en el análisis de datos discretos, a la presencia de una variación superior a la que implica una distribución binomial o de Poisson de referencia. La proporción de unidades no conformes en los laboratorios clínicos presenta una alta variabilidad y, generalmente, sobredispersión. Por ello, se requiere analizar las cartas de control más adecuadas que superen las limitaciones de cartas tradicionales para tratar datos sobredispersos. En este trabajo se realiza un análisis del monitoreo de procesos sobredispersos en laboratorios clínicos usando cartas de control. La metodología consta de cuatro pasos: (i) Determinación de la variable de interés, (ii) Diagnóstico de la sobredispersión de los datos, (iii) Elaboración de cartas de control, y (iv) Análisis de los resultados. Los resultados muestran que la metodología permite determinar cuantitativamente el grado de sobredispersión de los datos y seleccionar el gráfico de control más adecuado para monitorear el proceso.

Palabras clave:

monitoreo de procesos clínicos, mejoramiento de cartas de control, análisis de datos sobredispersos, ingeniería estadística.

1. Introduction

Improving healthcare quality requires permanent monitoring of process performance and the implementation of changes that benefit users. Data analysis approaches can be applied for this purpose, but measuring process performance in the healthcare sector is not a simple matter, since some assumptions and conditions generally must be met. One of the most relevant data characteristics in the health sector is the high variability in the results and the sample sizes [1].

Furthermore, the application of statistical quality control tools in the analysis of healthcare processes is not yet intensive due to the following aspects [2]: (i) There is resistance to accepting that an approach to improve the quality of industrial processes can be applied to healthcare, (ii) Statistical quality control is missing from the most popular books on medical statistics, and, (iii) Statistical quality control faces fundamental assumptions about how to develop documented improvements in healthcare.

However, quality control has attracted interest for application in medicine and healthcare. To this end, the traditional concepts of quality control of industrial processes have been adapted and transformed to be useful in monitoring health processes [3].

According to [4], control charts are used widely in different sectors such as banks, hospitals, and financial services, among others. Control charts can be applied in the healthcare sector to improve outpatient care, operating room efficiency, and also to reduce unnecessary costs and length of stay. They can also contribute to the need to monitor and control healthcare performance and minimize adverse events [5].

Additionally, control charts can help stakeholders manage change in healthcare and improve patient health [6]. Control charts allow locating and identifying the root cause of process variability to eradicate or control it [7]. The causes of variability can be classified into common and special causes.

Common or random causes are inherent to the process characteristics and eliminating or reducing them depends on modifying the system. Special or assignable causes are due to situations outside the process, that is, external factors that can be identified and eliminated [8].

The main objective of control charts is to monitor and analyze the variability of a process over time [9]. Depending on the type of data, control charts can be classified into two main groups: control charts for variables (continuous data) and control charts for attributes (discrete data) [10]. This paper focuses on control charts for attributes, specifically on the p-chart, the control chart for the proportion of defectives.

The elaboration of a p-chart is based on the fulfillment of several assumptions. First, the data are assumed to have a binomial distribution with independent events having a constant underlying probability of occurrence. In addition, the mean defective proportion must be constant over time, implying that it should not vary between subgroups [11].
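For reference, under these assumptions the three-sigma limits of a p-chart with variable subgroup sizes take the standard textbook form sketched below. The symbols are assumptions introduced here for illustration: $\bar{p}$ denotes the average proportion of nonconforming units and $n_i$ the size of subgroup i.

```latex
\[
UCL_i = \bar{p} + 3\sqrt{\frac{\bar{p}\,(1-\bar{p})}{n_i}}, \qquad
LCL_i = \max\!\left(0,\ \bar{p} - 3\sqrt{\frac{\bar{p}\,(1-\bar{p})}{n_i}}\right)
\]
```

Because the half-width shrinks in proportion to $1/\sqrt{n_i}$, very large subgroups produce very narrow limits, so any variation in the underlying proportion between subgroups is easily flagged as a signal.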

Overdispersed data occur when there is excessive variability within or between subgroups. Overdispersion makes it difficult to monitor processes with traditional p-charts, since some subgroups are likely to be flagged as out of control when they are not (type 1 error). Overdispersed data do not satisfy the assumption of a constant proportion of defectives and are characteristic of processes with large sample sizes.

Monitoring of overdispersed data requires a control chart that considers variation within and between subgroups. Thus, it is possible to differentiate between common and special causes of variation [12]. For example, an analysis of data on survival after coronary artery bypass grafts, emergency readmission rates, and teenage pregnancies is made in [13]. It is established that the treatment of overdispersed data leads to the conclusion that these processes are not under control using the traditional control chart approach.

In [14], control charts are used to monitor infections and mortality in surgical facilities, concluding that when dealing with large sample sizes there could be dependence between results, and new statistical methods need to be applied. Traditional control charts have limitations for analyzing infrequent events [6].

Also, in [15], a study on the number of patients seen in the first four hours in accident and emergency departments was performed. It concludes that traditional control charts lose effectiveness when dealing with overdispersed data, and a combined charting strategy should be used.

Although several previous academic works analyze the problem of overdispersed data in traditional control charts [16], the research gap on control chart applications suitable for this data characteristic remains considerable. Therefore, in this article, we analyze the monitoring of overdispersed processes in clinical laboratories and determine the most appropriate data analysis procedure and control charts.

This work constitutes an academic contribution to process analysis under conditions closer to reality and promotes the use of appropriate statistical tools for treating data that do not meet the assumptions of traditional control charts.

2. Methodology

The methodology includes four steps: (i) Determination of the interest variable, (ii) Diagnosis of data overdispersion, (iii) Elaboration of control charts, and (iv) Analysis of results.

2.1. Determination of the interest variable

Sample processing in clinical laboratories requires adequate sample collection for the subsequent tests, in order to avoid false negatives and false positives in the results. When a sample does not meet the established requirements, it is discarded and becomes a defective product from the perspective of statistical quality control.

Discarded samples generate waste of clinical laboratory resources and can cause user dissatisfaction because a new sampling is required. Therefore, it is essential to implement programs to monitor and control the sampling effectiveness to prevent errors in clinical tests and make decisions to improve the health service.

The data in this article come from a clinical laboratory in Colombia, with continuous monitoring for one year. The variable of interest is the number of clinical samples discarded per week, with a total monitoring time of 50 weeks. Thus, the monitoring parameter (p) in the control charts is the proportion of samples discarded in each subgroup (week).

2.2. Diagnosis of data overdispersion

The diagnosis of process data overdispersion is performed using the Jones and Govindaraju [17] and Heimann [18] methods.

Jones and Govindaraju [17] proposed a graphical method based on the variance ratio test (VRT), which compares the observed variation in a data sample with the expected variation of the corresponding binomial distribution. Let $d_i$ be the number of nonconforming units in subgroup i of size n, following a binomial distribution with parameter p (the proportion of nonconforming units). Then:

$$y_i = \sin^{-1}\sqrt{\frac{d_i + 3/8}{n + 3/4}} \qquad (1)$$

approximately follows a normal distribution with mean $\sin^{-1}\sqrt{\bar{p}}$ and variance $1/(4n)$, where $\bar{p}$ is the average proportion of nonconforming units over the m subgroups:

$$\bar{p} = \frac{1}{m\,n}\sum_{i=1}^{m} d_i$$

The procedure begins by transforming the data using (1) and then constructing the normal probability plot of the transformed data and its fitted line. Subsequently, the actual variation of the process is estimated as the distance along the $y_i$ axis between the points where the fitted line crosses the Z scores +1 and -1. If the actual variation is greater than the expected variation, estimated as $1.5/\sqrt{n}$, it is concluded that the data are overdispersed and it is not possible to state that they follow a binomial distribution.
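The graphical diagnosis can be approximated numerically. The sketch below is a hedged illustration, not the authors' procedure: it assumes the arcsine-square-root transform with the 3/8 and 3/4 continuity corrections for (1), Blom plotting positions for the normal probability plot, an ordinary least-squares fitted line, and n taken as the smallest weekly sample size (a choice that reproduces the expected variation of 0.0905 reported in Section 3.2). The function names are hypothetical; the data are the weekly counts from Table 1.

```python
import math
from statistics import NormalDist, mean

# Weekly totals and discarded samples, weeks 1-50, copied from Table 1.
TOTAL = [1467, 1789, 1345, 2347, 1734, 2378, 1893, 2992, 1935, 1967,
         1524, 2592, 3890, 1469, 2693, 2936, 1798, 2348, 1633, 1376,
         1234, 1357, 1594, 3789, 1598, 1591, 1726, 1279, 2430, 275,
         1784, 2694, 3583, 6945, 2895, 7492, 1764, 3790, 2578, 3895,
         4895, 2654, 4569, 2697, 2589, 2478, 2737, 1839, 1426, 1925]
DISCARDED = [105, 68, 140, 217, 61, 158, 80, 260, 81, 87,
             98, 91, 254, 79, 67, 72, 66, 109, 67, 80,
             88, 98, 158, 238, 135, 34, 112, 25, 132, 6,
             96, 74, 101, 276, 184, 351, 155, 226, 101, 114,
             415, 78, 62, 200, 143, 62, 49, 129, 84, 75]


def arcsine_transform(d, n):
    """Variance-stabilizing transform of a binomial count (assumed form of Eq. 1)."""
    return math.asin(math.sqrt((d + 3 / 8) / (n + 3 / 4)))


def fitted_spread(y):
    """Height of the least-squares line on the normal plot between Z = -1 and Z = +1."""
    m = len(y)
    ordered = sorted(y)
    # Blom plotting positions: an assumption, the source does not name a convention.
    z = [NormalDist().inv_cdf((i - 0.375) / (m + 0.25)) for i in range(1, m + 1)]
    z_bar, y_bar = mean(z), mean(ordered)
    slope = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, ordered))
             / sum((zi - z_bar) ** 2 for zi in z))
    return 2 * slope


y = [arcsine_transform(d, n) for d, n in zip(DISCARDED, TOTAL)]
actual = fitted_spread(y)
# Assumption: n is the smallest weekly sample size in Table 1.
expected = 1.5 / math.sqrt(min(TOTAL))
print(f"actual = {actual:.4f}, expected = {expected:.4f}, "
      f"overdispersed = {actual > expected}")
```

The exact value of the actual variation depends on the plotting positions and the fitting method, so small differences from a manually drawn plot are to be expected.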

On the other hand, Heimann [18] proposes a method based on decomposing the total variance and its representation as the sum of the sampling variance and the process variance. The sampling variance represents the difference between the estimate of the probability of producing nonconforming units (from the sample) and the actual value (from the process). The underlying process variance represents the variation in the probability of producing nonconforming units.

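Thus, the sampling variance is determined by:

$$\sigma_{\text{sampling}}^{2} = \frac{\bar{p}\,(1-\bar{p})}{n}$$

Similarly, let $p_{i}$ be the proportion of nonconforming units in subgroup i, with a total of m subgroups. The total variance is determined as follows:

$$\sigma_{\text{total}}^{2} = \frac{1}{m-1}\sum_{i=1}^{m}\left(p_{i}-\bar{p}\right)^{2}$$

and the underlying process variance results from:

$$\sigma_{\text{process}}^{2} = \sigma_{\text{total}}^{2} - \sigma_{\text{sampling}}^{2}$$

Then, r is defined as the variance ratio between the total variance and the sampling variance:

$$r = \frac{\sigma_{\text{total}}^{2}}{\sigma_{\text{sampling}}^{2}}$$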

If r > 1.357, it is concluded that there is extra variability in the data, greater than a binomial distribution implies, and it is not appropriate to choose a p-chart for process monitoring and control.
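A minimal sketch of this variance decomposition, applied to the Table 1 data, is given below. It rests on assumptions that are not spelled out in the text: the sampling variance $\bar{p}(1-\bar{p})/n_i$ is averaged over the weeks because the sample size varies, $\bar{p}$ is the pooled proportion, and the total variance uses an m - 1 denominator. The function name `heimann_variance_ratio` is hypothetical.

```python
# Weekly totals and discarded samples, weeks 1-50, copied from Table 1.
TOTAL = [1467, 1789, 1345, 2347, 1734, 2378, 1893, 2992, 1935, 1967,
         1524, 2592, 3890, 1469, 2693, 2936, 1798, 2348, 1633, 1376,
         1234, 1357, 1594, 3789, 1598, 1591, 1726, 1279, 2430, 275,
         1784, 2694, 3583, 6945, 2895, 7492, 1764, 3790, 2578, 3895,
         4895, 2654, 4569, 2697, 2589, 2478, 2737, 1839, 1426, 1925]
DISCARDED = [105, 68, 140, 217, 61, 158, 80, 260, 81, 87,
             98, 91, 254, 79, 67, 72, 66, 109, 67, 80,
             88, 98, 158, 238, 135, 34, 112, 25, 132, 6,
             96, 74, 101, 276, 184, 351, 155, 226, 101, 114,
             415, 78, 62, 200, 143, 62, 49, 129, 84, 75]


def heimann_variance_ratio(d, n):
    """Return (sampling, total, process, r) for counts d and sample sizes n."""
    m = len(d)
    p_bar = sum(d) / sum(n)                      # pooled average proportion
    p = [di / ni for di, ni in zip(d, n)]        # weekly proportions
    # Assumption: with variable n_i, the binomial sampling variance is averaged.
    sampling = sum(p_bar * (1 - p_bar) / ni for ni in n) / m
    total = sum((pi - p_bar) ** 2 for pi in p) / (m - 1)
    process = total - sampling
    return sampling, total, process, total / sampling


sampling, total, process, r = heimann_variance_ratio(DISCARDED, TOTAL)
print(f"sampling = {sampling:.8f}, total = {total:.6f}, r = {r:.2f}")
print("overdispersed" if r > 1.357 else "consistent with binomial")
```

Because the sketch keeps the unrounded variances, the resulting r differs slightly from a ratio computed from values rounded to six decimals.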

2.3. Elaboration of control charts

Various control chart types are developed and compared in order to select the most suitable one for monitoring overdispersed processes in clinical laboratories. Therefore, we develop the p-chart with variable limits, normalized p-chart (Z), moving range chart (X-MR), p’-chart of Laney, and the chart proposed by Goedhart and Woodall.

The construction procedures of the first three control charts are not described in this article since they are widely known and extensively detailed in textbooks on statistical quality control [10] [19]. The construction procedures for the last two control charts are detailed below.

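2.3.1. p'-chart of Laney

This control chart considers both intra-sample and inter-sample variation, with a multiplicative effect for calculating the control limits (CLs) [11]:

$$CL = \bar{p} \pm 3\,\sigma_{p_i}\,\sigma_{z}$$

where $\sigma_{z}$ is the inter-sample variation and $\sigma_{p_i}$ is the intra-sample variation:

$$\sigma_{p_i} = \sqrt{\frac{\bar{p}\,(1-\bar{p})}{n_{i}}}, \qquad \sigma_{z} = \frac{\overline{MR}}{1.128}$$

where $n_{i}$ is the variable sample size, $\bar{p}$ is the average proportion, and $\overline{MR}$ is the average moving range of the z-scores for each subgroup, $z_i$:

$$z_i = \frac{p_i - \bar{p}}{\sigma_{p_i}}$$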
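The calculation can be sketched in a few lines. The code below assumes the usual form of Laney's limits, $\bar{p} \pm 3\,\sigma_{p_i}\,\sigma_z$ with $\sigma_z$ equal to the average moving range of the z-scores divided by 1.128, and floors the lower limit at zero because a proportion cannot be negative. The function name `laney_limits` is hypothetical, and the charts in this paper were built in a spreadsheet, not with this code.

```python
import math

# Weekly totals and discarded samples, weeks 1-50, copied from Table 1.
TOTAL = [1467, 1789, 1345, 2347, 1734, 2378, 1893, 2992, 1935, 1967,
         1524, 2592, 3890, 1469, 2693, 2936, 1798, 2348, 1633, 1376,
         1234, 1357, 1594, 3789, 1598, 1591, 1726, 1279, 2430, 275,
         1784, 2694, 3583, 6945, 2895, 7492, 1764, 3790, 2578, 3895,
         4895, 2654, 4569, 2697, 2589, 2478, 2737, 1839, 1426, 1925]
DISCARDED = [105, 68, 140, 217, 61, 158, 80, 260, 81, 87,
             98, 91, 254, 79, 67, 72, 66, 109, 67, 80,
             88, 98, 158, 238, 135, 34, 112, 25, 132, 6,
             96, 74, 101, 276, 184, 351, 155, 226, 101, 114,
             415, 78, 62, 200, 143, 62, 49, 129, 84, 75]


def laney_limits(d, n):
    """Return (p_bar, sigma_z, lcl, ucl) for a Laney p'-chart."""
    p_bar = sum(d) / sum(n)
    # Intra-sample (binomial) standard deviation for each subgroup.
    sigma_p = [math.sqrt(p_bar * (1 - p_bar) / ni) for ni in n]
    z = [(di / ni - p_bar) / s for di, ni, s in zip(d, n, sigma_p)]
    # Inter-sample variation from the average moving range of the z-scores.
    mr_bar = sum(abs(a - b) for a, b in zip(z[1:], z[:-1])) / (len(z) - 1)
    sigma_z = mr_bar / 1.128
    ucl = [p_bar + 3 * s * sigma_z for s in sigma_p]
    # Assumption: lower limits are floored at zero.
    lcl = [max(0.0, p_bar - 3 * s * sigma_z) for s in sigma_p]
    return p_bar, sigma_z, lcl, ucl


p_bar, sigma_z, lcl, ucl = laney_limits(DISCARDED, TOTAL)
spread = max(ucl) - min(lcl)
print(f"p_bar = {p_bar:.4f}, sigma_z = {sigma_z:.2f}, max spread = {spread:.3f}")
```

On the Table 1 data, $\sigma_z$ comes out well above 1, which is itself an indication of overdispersion, and the spread between the extreme limits is close to the 0.255 reported in Section 3.4.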

2.3.2. p-chart of Goedhart and Woodall

In this control chart, the control limits (CLs) are calculated from an additive combination of the intra-sample and inter-sample variations, considering the average standard deviation of the proportion [20]:

where is the intra-sample variance, is the inter-sample variance and ${\bar{σ}}_{p}^{2}$ is the mean variance of the proportions.

3. Results

3.1. Data description of the interest variable

Table 1 shows the total number of samples collected and the number of samples discarded per week in the clinical laboratory under study. The average proportion of nonconforming units, $\bar{p}$, is equal to 0.05. This relatively low value, relative to the standards set by the laboratory, could lead to the preliminary inference that the process has large sample sizes or a low number of nonconforming units per subgroup.

Table 1: Total samples and Nonconforming samples per week.

Week	Total samples	Nonconforming Samples	Week	Total samples	Nonconforming Samples
1	1467	105	26	1591	34
2	1789	68	27	1726	112
3	1345	140	28	1279	25
4	2347	217	29	2430	132
5	1734	61	30	275	6
6	2378	158	31	1784	96
7	1893	80	32	2694	74
8	2992	260	33	3583	101
9	1935	81	34	6945	276
10	1967	87	35	2895	184
11	1524	98	36	7492	351
12	2592	91	37	1764	155
13	3890	254	38	3790	226
14	1469	79	39	2578	101
15	2693	67	40	3895	114
16	2936	72	41	4895	415
17	1798	66	42	2654	78
18	2348	109	43	4569	62
19	1633	67	44	2697	200
20	1376	80	45	2589	143
21	1234	88	46	2478	62
22	1357	98	47	2737	49
23	1594	158	48	1839	129
24	3789	238	49	1426	84
25	1598	135	50	1925	75
			TOTAL	124208	6241

Source: The authors.

Table 1 shows that both the sample sizes and the number of nonconforming units per subgroup have a high variation. However, the average sample variance is 0.00002559, a relatively low value. This requires a detailed analysis of the data dispersion, as detailed in the next step.
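As a quick arithmetic check, the pooled totals in the last row of Table 1 reproduce the reported average proportion. The worked line below assumes that $\bar{p}$ is computed from the pooled totals and not as the mean of the weekly proportions.

```latex
\[
\bar{p} = \frac{6241}{124208} \approx 0.0502
\]
```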

3.2. Results of data overdispersion diagnosis

Fig. 1 shows the normal probability plot of the data transformed by applying the Jones and Govindaraju method, described in section 2.2. According to the results, the actual process variation is equal to 0.1073 and the expected variation is equal to 0.0905. Since the actual variation is greater than the expected variation, it is possible to state that the process data are overdispersed.

Moreover, applying the Heimann method, the sampling variance is 0.000026 and the total variance is 0.000537. Therefore, the variance ratio r is equal to 20.65. Since r > 1.357, it can be stated that the data have a higher variation than expected for a binomial distribution.

Finally, it is possible to state that the largest share of the variation corresponds to the underlying process variation. Data overdispersion is identified by both diagnostic methods; therefore, it is not appropriate to choose a traditional p-chart for process monitoring and control.
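The arithmetic behind these statements can be laid out explicitly from the rounded variances reported above. The subscripted symbols are labels chosen here for illustration: "sampling" for the sampling variance, "total" for the total variance, and "process" for the underlying process variance.

```latex
\[
r = \frac{\sigma^{2}_{\text{total}}}{\sigma^{2}_{\text{sampling}}}
  = \frac{0.000537}{0.000026} \approx 20.65,
\qquad
\sigma^{2}_{\text{process}} = 0.000537 - 0.000026 = 0.000511
\]
\[
\frac{\sigma^{2}_{\text{process}}}{\sigma^{2}_{\text{total}}}
  = \frac{0.000511}{0.000537} \approx 0.95
\]
```

Under these rounded figures, roughly 95% of the total variance is attributable to the underlying process and only about 5% to binomial sampling.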

3.3. Elaboration of control charts

In order to compare performance and select the most appropriate control chart(s) for monitoring overdispersed processes in clinical laboratories, we developed the p-chart with variable limits (Fig. 2), the normalized p-chart (Fig. 3), the X-MR chart (Fig. 4), the p'-chart of Laney (Fig. 5) and the p-chart of Goedhart and Woodall (Fig. 6). The control charts were made using Microsoft Excel.

3.4. Analysis of results

Traditional control charts have limitations for monitoring overdispersed data due to the assumptions required for their construction [21]. The p-chart with variable limits and the normalized p-chart assume a binomial distribution of the data and that the mean remains constant over time.

The above two assumptions are not satisfied in overdispersed data. Therefore, many points fall outside the control limits, as shown in Figs. 2 and 3. Because of that, it is not appropriate to use either of these two control charts for monitoring overdispersed processes since the data assumptions are not satisfied, and type 1 errors may occur.
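The effect described above can be reproduced on the Table 1 data with a short sketch. It assumes conventional three-sigma p-chart limits with variable subgroup size and a lower limit floored at zero; the function name `p_chart_signals` is hypothetical, and the charts in Figs. 2 and 3 were not produced with this code.

```python
import math

# Weekly totals and discarded samples, weeks 1-50, copied from Table 1.
TOTAL = [1467, 1789, 1345, 2347, 1734, 2378, 1893, 2992, 1935, 1967,
         1524, 2592, 3890, 1469, 2693, 2936, 1798, 2348, 1633, 1376,
         1234, 1357, 1594, 3789, 1598, 1591, 1726, 1279, 2430, 275,
         1784, 2694, 3583, 6945, 2895, 7492, 1764, 3790, 2578, 3895,
         4895, 2654, 4569, 2697, 2589, 2478, 2737, 1839, 1426, 1925]
DISCARDED = [105, 68, 140, 217, 61, 158, 80, 260, 81, 87,
             98, 91, 254, 79, 67, 72, 66, 109, 67, 80,
             88, 98, 158, 238, 135, 34, 112, 25, 132, 6,
             96, 74, 101, 276, 184, 351, 155, 226, 101, 114,
             415, 78, 62, 200, 143, 62, 49, 129, 84, 75]


def p_chart_signals(d, n):
    """Return the weeks (1-based) whose proportion falls outside the p-chart limits."""
    p_bar = sum(d) / sum(n)
    flagged = []
    for week, (di, ni) in enumerate(zip(d, n), start=1):
        sigma = math.sqrt(p_bar * (1 - p_bar) / ni)   # binomial standard error
        lcl = max(0.0, p_bar - 3 * sigma)             # assumption: floor at zero
        ucl = p_bar + 3 * sigma
        if not lcl <= di / ni <= ucl:
            flagged.append(week)
    return flagged


signals = p_chart_signals(DISCARDED, TOTAL)
print(f"{len(signals)} of {len(TOTAL)} weeks fall outside the p-chart limits")
print("flagged weeks:", signals)
```

A large share of the weeks is flagged even though no special cause has been identified for them, which is the pattern expected when binomial limits are applied to overdispersed data.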

According to [17], the accepted solution for decades for monitoring overdispersed data was the X-MR chart (Fig. 4). Although this control chart also assumes a binomial distribution of the data, it compensates for the inter-sample variation by using the average moving range as a factor for calculating the control limits. However, it does not consider intra-sample variation and presents constant control limits.

The p'-chart of Laney (Fig. 5) and the p-chart of Goedhart and Woodall (Fig. 6) overcome the limitations of previous control charts and have variable control limits that adjust depending on inter-sample and intra-sample variation. As mentioned in section 2.3, the main difference between these two control charts lies in the effect of inter-sample and intra-sample variations in calculating the control limits. While the p'-chart of Laney has a multiplicative approach, the p-chart of Goedhart and Woodall has an additive approach.

The latter is shown in Fig. 7. The multiplicative effect of the variances causes higher variability in the width of the control limits in the Laney control chart. The maximum range of variation of the control limits, obtained as the difference between the extreme values of the control limits in each control chart, is higher in the p'-chart of Laney (0.255) than in the p-chart of Goedhart and Woodall (0.129). The p-chart of Goedhart and Woodall has control limits that do not overestimate the effect of variances and therefore have lower variability over time.

Figure 7: p'-chart of Laney versus p-chart of Goedhart & Woodall.

Finally, Table 2 shows a comparative summary of the control charts analyzed based on the results of this work. Whether a control chart is adequate for monitoring overdispersed processes depends on two main conditions: (i) the control chart must be applicable without assumptions about the data distribution or the stability of the process mean, and (ii) the control chart must consider both intra-sample and inter-sample variances.

Table 2: Comparative overview of control charts for monitoring overdispersed processes.

Control chart	Applicable without data distributional assumptions?	Consider both intra-sample and inter-sample variances?	Adequate for overdispersed data monitoring?
p-chart	No	No	No
Z-chart	No	No	No
X-MR chart	No	No	No
p´-Laney	Yes	Yes	Yes
p-Goedhart & Woodall	Yes	Yes	Yes

Source: The authors.

4. Conclusions

Monitoring the proportion of nonconforming units in clinical laboratories is important for quality assurance in patient service and proper process management. Process data in clinical laboratories present high variability because the number of samples taken varies over time since it depends on patient demand. Human intervention in sample collection and processing also increases the variability of the results.

Excessive variability in clinical processes means that the data are overdispersed and do not fulfill the assumptions required for monitoring with traditional control charts. In these cases, using traditional control charts can lead to erroneous conclusions about the process behavior. For this reason, it is necessary to develop comprehensive studies that consider the data characteristics when implementing quality control tools in the health sector.

In this paper, we performed a comparative study of the application of different control charts for monitoring overdispersed processes in clinical laboratories. We also proposed a methodological scheme for adequate monitoring, focused on the diagnosis of data overdispersion through a graphical and an analytical method.

The proposed methodological approach and the developed case study lead to the conclusion that the main criterion for applying control charts to overdispersed processes is to consider both the inter-sample and intra-sample variations and their effect on the calculation of the control limits.

This article is a product of ongoing research whose main objective is to improve statistical process management programs in the health sector. The following research stage will apply the methodology considering other attributes such as non-conformities.

It is also of interest to make applications in areas such as patient care in hospitals, monitoring of drug prescriptions, climatic or environmental phenomena, and, overall, processes where the sample size has a high variability between batches.

References

[1] Benneyan, J.C., Lloyd, R.C. and Plsek, P.E., Statistical process control as a tool for research and healthcare improvement. BMJ Quality & Safety, 12(6), pp. 458-464, 2003. DOI: https://doi.org/10.1136/qhc.12.6.458
[2] Mohammed, M.A., Using statistical process control to improve the quality of health care. BMJ Quality & Safety, 13(4), pp. 243-245, 2004. DOI: https://doi.org/10.1136/qshc.2004.010454
[3] Sos, J.E., Current methods for quality control and monitoring of medical care. Revista de Senología y Patología Mamaria, 27(2), pp. 94-98, 2014. DOI: https://doi.org/10.1016/j.senol.2013.11.001
[4] Zacharia, A. and Ravichandran, J., Six Sigma-based RS, MDSS and MDSRS control charts. International Journal of Six Sigma and Competitive Advantage, 13(4), pp. 435-455, 2021. DOI: https://doi.org/10.1504/IJSSCA.2021.120576
[5] Suman, G. and Prajapati, D., Control chart applications in healthcare: a literature review. International Journal of Metrology and Quality Engineering, 9, art. 5, 2018. DOI: https://doi.org/10.1051/ijmqe/2018003
[6] Thor, J., Lundberg, J., Ask, J., Olsson, J., Carli, C., Härenstam, K.P. and Brommels, M., Application of statistical process control in healthcare improvement: systematic review. BMJ Quality & Safety, 16(5), pp. 387-399, 2007. DOI: https://doi.org/10.1136/qshc.2006.022194
[7] Loaiza-Arbeláez, R.D., Gráficos de control no paramétricos basados en R-estadísticos con aplicación al caso multivariante, BSc. Thesis, Escuela Politécnica Superior, Universidad Carlos III de Madrid, Madrid, Spain, 2013.
[8] Burr, I.W., Statistical quality control methods. Routledge, New York, USA, 2018.
[9] Krishnamoorthi, K.S., Krishnamoorthi, V.R. and Pennathur, A., A first course in quality engineering: integrating statistical and management methods of quality. CRC Press, Taylor & Francis Group, Boca Raton, USA, 2018, 599 P.
[10] Gutiérrez, H. and De la Vara-Salazar, R., Control estadístico de calidad y seis sigma, 3ra ed. McGraw-Hill, México, 2013.
[11] Laney, D.B., Improved control charts for attributes. Quality Engineering, 14(4), pp. 531-537, 2002. DOI: https://doi.org/10.1081/QEN-120003555
[12] Gutiérrez, H., Calidad total y productividad, 3 ed. México: McGraw-Hill, 2010.
[13] Spiegelhalter, D.J., Handling over-dispersion of performance indicators. BMJ Quality & Safety, 14(5), pp. 347-351, 2005. DOI: https://doi.org/10.1136/qshc.2005.013755
[14] Woodall, W.H., Fogel, S.L. and Steiner, S.H., The monitoring and improvement of surgical-outcome quality. Journal of Quality Technology, 47(4), pp. 383-399, 2015. DOI: https://doi.org/10.1080/00224065.2015.11918141
[15] Mohammed, M.A., Panesar, J.S., Laney, D.B. and Wilson, R., Statistical process control charts for attribute data involving very large sample sizes: a review of problems and solutions. BMJ Quality & Safety, 22(4), pp. 362-368, 2013. DOI: https://doi.org/10.1136/bmjqs-2012-001373
[16] Vidmar, G. and Blagus, R., Outlier detection for healthcare quality monitoring - a comparison of four approaches to over-dispersed proportions. Quality and Reliability Engineering International, 30(3), pp. 347-362, 2014. DOI: https://doi.org/10.1002/qre.1581
[17] Jones, G. and Govindaraju, K., A graphical method for checking attribute control chart assumptions. Quality Engineering, 13(1), pp. 19-26, 2001. DOI: https://doi.org/10.1080/08982110108918620
[18] Heimann, P.A., Attributes control charts with large sample sizes. Journal of Quality Technology, 28(4), pp. 451-459, 1996. DOI: https://doi.org/10.1080/00224065.1996.11979703
[19] Montgomery, D.C., Introduction to statistical quality control, 8th ed. John Wiley & Sons, New York, U.S., 2020.
[20] Goedhart, R. and Woodall, W.H., Monitoring proportions with two components of common cause variation. Journal of Quality Technology, 54(3), pp. 324-337, 2021. DOI: https://doi.org/10.1080/00224065.2021.1903823
[21] Ali, S., Pievatolo, A. and Göb, R., An overview of control charts for high-quality processes. Quality and Reliability Engineering International, 32(7), pp. 2171-2189, 2016. DOI: https://doi.org/10.1002/qre.1957

How to cite: Valdés-Manuel, J.I. and Cogollo-Flórez, J.M., Monitoring overdispersed process in clinical laboratories using control charts. DYNA, 89(224), pp. 28-33, October - December, 2022.

J.I. Valdés-Manuel, received the BSc. Eng. in Industrial Engineering from the Tecnológico de Estudios Superiores de Jocotitlán, México. He has worked on Six Sigma and Lean Manufacturing implementation projects. Currently, he is a process analyst in Películas Plásticas S.A. de C.V., Atlacomulco, México. His research interests include Statistical Quality Control, Quality Management, and Production and Services Processes Optimization. ORCID: 0000-0002-2286-4753

J.M. Cogollo-Flórez, received the PhD in Engineering - Industry and Organizations and the MSc. in Management Engineering, both from the Universidad Nacional de Colombia. Currently, he is a full professor in the Quality and Production Department, Instituto Tecnológico Metropolitano - ITM, Medellín, Colombia. His research interests include advanced statistical quality control, supply chain quality management, performance measurement, fuzzy modeling, and process quality modeling. ORCID: 0000-0002-6101-3134