Robust mixture regression based on the skew t distribution
DOI: https://doi.org/10.15446/rce.v40n1.53580
Keywords: Mixture regression models, robust regression, maximum likelihood, EM algorithm, skew t distribution
1 Giresun University, Faculty of Arts and Sciences, Department of Statistics, Giresun, Turkey. PhD. Email: fatma.dogru@giresun.edu.tr
2 Ankara University, Faculty of Science, Department of Statistics, Ankara, Turkey. PhD. Email: oarslan@ankara.edu.tr
In this study, we explore a robust mixture regression procedure based on the skew t distribution in order to model heavy-tailed and/or skewed errors in a mixture regression setting. We present an EM-type algorithm that computes the maximum likelihood estimators of the parameters of interest using the scale mixture representation of the skew t distribution. The performance of the proposed estimators is demonstrated through a simulation study and a real data example.
Key words: EM Algorithm, Maximum Likelihood, Mixture Regression Model, Skew t Distribution.
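As an illustration of the scale mixture representation mentioned in the abstract, the following minimal Python sketch draws skew t variates hierarchically. It assumes the Azzalini and Capitanio form of the skew t distribution and Henze's representation of the skew normal; the abstract does not state the exact parameterization used in the paper, so the function names (sample_skew_t, skew_t_mean) and the parameters nu, lam, mu and sigma are hypothetical choices made for this example only. Under these assumptions Y = mu + sigma * Z / sqrt(W), where Z is skew normal with shape lam and W is Gamma(nu/2, rate nu/2), independent of Z.

```python
import math
import random


def sample_skew_t(nu, lam, mu=0.0, sigma=1.0, rng=random):
    """Draw one skew t value through its scale mixture form (assumed parameterization)."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    # Skew normal draw: delta * |U0| + sqrt(1 - delta^2) * U1, with U0, U1 iid N(0, 1).
    u0 = abs(rng.gauss(0.0, 1.0))
    u1 = rng.gauss(0.0, 1.0)
    z = delta * u0 + math.sqrt(1.0 - delta * delta) * u1
    # Mixing variable: W ~ Gamma(shape = nu/2, rate = nu/2), so scale = 2/nu.
    w = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return mu + sigma * z / math.sqrt(w)


def skew_t_mean(nu, lam, mu=0.0, sigma=1.0):
    """Theoretical mean under the same parameterization; finite only for nu > 1."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    log_ratio = math.lgamma((nu - 1.0) / 2.0) - math.lgamma(nu / 2.0)
    return mu + sigma * delta * math.sqrt(nu / math.pi) * math.exp(log_ratio)


# Compare the Monte Carlo mean with the closed-form mean for one parameter choice.
rng = random.Random(0)
draws = [sample_skew_t(nu=8.0, lam=3.0, rng=rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
print(round(sample_mean, 3), round(skew_t_mean(8.0, 3.0), 3))
```

Conditional on W = w, the draw is a skew normal variate with scale sigma / sqrt(w). This hierarchical structure is the usual reason a scale mixture form is convenient for EM-type algorithms, though the sketch above only simulates data and does not reproduce the estimation procedure of the paper.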
Full text available in PDF
This article can be cited in LaTeX using the following BibTeX entry:
@ARTICLE{RCEv40n1a09,
AUTHOR = {Dogru, Fatma Zehra and Arslan, Olcay},
TITLE = {{Robust Mixture Regression Based on the Skew t Distribution}},
JOURNAL = {Revista Colombiana de Estadística},
YEAR = {2017},
volume = {40},
number = {1},
pages = {45-64}
}
CrossRef Cited-by
1. Haroon M. Barakat, Abdallh W. Aboutahoun, Naeema El-kadar. (2019). A New Extended Mixture Skew Normal Distribution, With Applications. Revista Colombiana de Estadística, 42(2), p.167. https://doi.org/10.15446/rce.v42n2.70087.
2. Fatma Zehra Doğru, Keming Yu, Olcay Arslan. (2019). Heteroscedastic and heavy-tailed regression with mixtures of skew Laplace normal distributions. Journal of Statistical Computation and Simulation, 89(17), p.3213. https://doi.org/10.1080/00949655.2019.1658111.
3. Chris Adcock, Adelchi Azzalini. (2020). A Selective Overview of Skew-Elliptical and Related Distributions and of Their Applications. Symmetry, 12(1), p.118. https://doi.org/10.3390/sym12010118.
4. Fatma Zehra Doğru, Olcay Arslan. (2021). Robust mixture regression modeling based on the generalized M (GM)-estimation method. Communications in Statistics - Simulation and Computation, 50(9), p.2643. https://doi.org/10.1080/03610918.2019.1610442.
5. H. M. Barakat, M. H. Dwes, Tuncer Acar. (2022). Limit Distributions of Ordered Random Variables in Mixture of Two Gaussian Sequences. Journal of Mathematics, 2022(1). https://doi.org/10.1155/2022/7956195.
6. Víctor Hugo Lachos Dávila, Celso Rômulo Barbosa Cabral, Camila Borelli Zeller. (2018). Finite Mixture of Skewed Distributions. SpringerBriefs in Statistics, p.77. https://doi.org/10.1007/978-3-319-98029-4_6.
7. H. M. Barakat, M. H. Dwes. (2022). Asymptotic behavior of ordered random variables in mixture of two Gaussian sequences with random index. AIMS Mathematics, 7(10), p.19306. https://doi.org/10.3934/math.20221060.
8. Atefeh Zarei, Zahra Khodadadi, Mohsen Maleki, Karim Zare. (2023). Robust mixture regression modeling based on two-piece scale mixtures of normal distributions. Advances in Data Analysis and Classification, 17(1), p.181. https://doi.org/10.1007/s11634-022-00495-6.
9. Fatma Zehra Doğru, Olcay Arslan. (2017). Parameter estimation for mixtures of skew Laplace normal distributions and application in mixture regression modeling. Communications in Statistics - Theory and Methods, 46(21), p.10879. https://doi.org/10.1080/03610926.2016.1252400.
10. Weizhong Tian, Fengrong Wei, Thomas Brown. (2020). Mixture network autoregressive model with application on students’ successes. Frontiers of Mathematics in China, 15(1), p.141. https://doi.org/10.1007/s11464-020-0813-5.
11. Fatma Zehra Doğru, Olcay Arslan. (2023). Mixture regression modelling based on the shape mixtures of skew Laplace normal distribution. Journal of Statistical Computation and Simulation, 93(18), p.3403. https://doi.org/10.1080/00949655.2023.2226281.
License
Copyright 2017 Revista Colombiana de Estadística

This work is licensed under a Creative Commons Attribution 4.0 International License.
- Authors retain their copyright and grant the journal the right of first publication of their work, which is simultaneously subject to the Creative Commons Attribution License (CC Attribution 4.0) that allows third parties to share the work provided that its author and its first publication in this journal are indicated.
- Authors may enter into other non-exclusive licensing agreements for distributing the published version of the work (e.g., depositing it in an institutional repository or publishing it in a monograph), provided that its initial publication in this journal is indicated.
- Authors are permitted and encouraged to disseminate their work online (e.g., in institutional repositories or on their own website) before and during the submission process, as this can lead to productive exchanges and increase citations of the published work. (See The Effect of Open Access).