Robust mixture regression based on the skew t distribution
Mixtura robusta de modelos de regresión basada en la distribución t asimétrica
DOI: https://doi.org/10.15446/rce.v40n1.53580
Keywords: Mixture regression models, robust regression, maximum likelihood, EM algorithm, skew t distribution
This study explores a robust mixture of regression models based on the skew t distribution, with the aim of modelling heavy-tailed or skewed errors in a mixture regression setting. An EM algorithm is used to obtain the maximum likelihood estimators by means of a scale mixture of the skew t distribution. The behaviour of the proposed estimators is illustrated through a simulation study and a real data example.
1 Giresun University, Faculty of Arts and Sciences, Department of Statistics, Giresun, Turkey. PhD. Email: fatma.dogru@giresun.edu.tr
2 Ankara University, Faculty of Science, Department of Statistics, Ankara, Turkey. PhD. Email: oarslan@ankara.edu.tr
In this study, we explore a robust mixture regression procedure based on the skew t distribution in order to model heavy-tailed and/or skewed errors in a mixture regression setting. We present an EM-type algorithm that computes the maximum likelihood estimators of the parameters of interest using the scale mixture representation of the skew t distribution. The performance of the proposed estimators is demonstrated by a simulation study and a real data example.
Key words: EM Algorithm, Maximum Likelihood, Mixture Regression Model, Skew t Distribution.
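The scale mixture representation mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes an Azzalini-type skew t parametrization (location, scale, shape, degrees of freedom), and the function names `sample_skew_t` and `simulate_mixture_regression`, the uniform covariate, and all parameter values are hypothetical choices made only for illustration. It draws a skew normal variate, divides it by the square root of a gamma variate to obtain a skew t error, and uses that error in a two-component mixture of simple linear regressions.

```python
import math
import random


def sample_skew_t(rng, loc, scale, shape, df):
    """Draw one skew t variate via a scale mixture representation.

    Assumed parametrization (not necessarily the paper's notation):
    Y = loc + scale * Z / sqrt(W), with Z standard skew normal with
    skewness parameter `shape` and W ~ Gamma(df/2, rate=df/2).
    """
    delta = shape / math.sqrt(1.0 + shape * shape)
    # Stochastic representation of a standard skew normal variate:
    # a half-normal part plus an independent normal part.
    u0 = abs(rng.gauss(0.0, 1.0))
    u1 = rng.gauss(0.0, 1.0)
    z = delta * u0 + math.sqrt(1.0 - delta * delta) * u1
    # gammavariate takes (shape, scale); rate df/2 means scale 2/df.
    w = rng.gammavariate(df / 2.0, 2.0 / df)
    return loc + scale * z / math.sqrt(w)


def simulate_mixture_regression(n, weights, betas, scales, shapes, dfs, seed=0):
    """Simulate (x, y, label) triples from a mixture of linear regressions."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)  # hypothetical covariate design
        # Pick the latent component according to the mixing proportions.
        j = rng.choices(range(len(weights)), weights=weights)[0]
        intercept, slope = betas[j]
        # Note: a skew t error with zero location does not have mean zero.
        error = sample_skew_t(rng, 0.0, scales[j], shapes[j], dfs[j])
        data.append((x, intercept + slope * x + error, j))
    return data


# Illustrative (made-up) settings: two components with opposite skewness.
data = simulate_mixture_regression(
    n=500,
    weights=[0.4, 0.6],
    betas=[(0.0, 1.0), (0.0, -1.0)],
    scales=[0.5, 0.5],
    shapes=[3.0, -3.0],
    dfs=[4.0, 4.0],
)
```

In an EM-type fit, the component label and the gamma and half-normal variables would typically be treated as missing data; here they are simply generated, which is why the sketch covers only the data-generating side and not the estimation step.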
Full text available in PDF
References
1. Akaike, H. (1973), Information theory and an extension of the maximum likelihood principle, 'Proceeding of the Second International Symposium on Information Theory', Akademiai Kiado, Budapest, p. 267-281.
2. Azzalini, A. (1986), 'Further results on a class of distributions which includes the normal ones', Statistica 46, 199-208.
3. Azzalini, A. & Capitanio, A. (2003), 'Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution', Journal of the Royal Statistical Society: Series B 65, 367-389.
4. Bai, X. (2010), Robust mixture of regression models, Master's thesis, Kansas State University.
5. Bai, X., Yao, W. & Boyer, J. E. (2012), 'Robust fitting of mixture regression models', Computational Statistics and Data Analysis 56, 2347-2359.
6. Basford, K. E., Greenway, D. R., McLachlan, G. J. & Peel, D. (1997), 'Standard errors of fitted means under normal mixture', Computational Statistics 12, 1-17.
7. Bashir, S. & Carter, E. (2012), 'Robust mixture of linear regression models', Communications in Statistics-Theory and Methods 41, 3371-3388.
8. Bozdogan, H. (1993), Choosing the number of component clusters in the mixture model using a new informational complexity criterion of the inverse-Fisher information matrix, 'Information and Classification', Springer Berlin Heidelberg, p. 40-54.
9. Cohen, A. C. (1984), 'Some effects of inharmonic partials on interval perception', Music Perception 1, 323-349.
10. Dempster, A. P., Laird, N. M. & Rubin, D. B. (1977), 'Maximum likelihood from incomplete data via the E-M algorithm', Journal of the Royal Statistical Society: Series B 39, 1-38.
11. Dias, J. G. & Wedel, M. (2004), 'An empirical comparison of EM, SEM and MCMC performance for problematic Gaussian mixture likelihoods', Statistics and Computing 14, 323-332.
12. Dogru, F. Z. (2015), Robust Parameter Estimation in Mixture Regression Models, PhD thesis, Ankara University.
13. Dogru, F. Z. & Arslan, O. (2014), Robust mixture regression modelling based on the skew t distribution, 'International Conference on Robust Statistics (ICORS14)', Martin-Luther-University Halle-Wittenberg/Germany.
14. Gupta, A. (2003), 'Multivariate skew t distribution', Statistics 37, 359-363.
15. Gupta, A., Chang, F. & Huang, W. (2002), 'Some skew symmetric models', Random Operators Stochastic Equations 10, 133-140.
16. Hennig, C. (2013), fpc: Flexible procedures for clustering. R Package Version 2.1-5.
17. Henze, N. (1986), 'A probabilistic representation of the skew-normal distribution', Scandinavian Journal of Statistics 13, 271-275.
18. Lange, K. L., Little, R. J. A. & Taylor, J. M. G. (1989), 'Robust statistical modeling using the t distribution', Journal of the American Statistical Association 84, 881-896.
19. Lin, T. I., Lee, J. C. & Hsieh, W. J. (2007), 'Robust mixture modeling using the skew t distribution', Statistics and Computing 17, 81-92.
20. Liu, M. & Lin, T. I. (2014), 'A skew-normal mixture regression model', Educational and Psychological Measurement 74(1), 139-162.
21. Lucas, A. (1997), 'Robustness of the Student t based M-estimator', Communications in Statistics: Theory and Methods 26, 1165-1182.
22. Markatou, M. (2000), 'Mixture models, robustness, and the weighted likelihood methodology', Biometrics 56, 483-486.
23. Peel, D. & McLachlan, G. J. (2000), 'Robust mixture modelling using the t distribution', Statistics and Computing 10(4), 339-348.
24. Pereira, J. R., Marques, L. A. & da Costa, J. M. (2012), 'An empirical comparison of EM initialization methods and model choice criteria for mixtures of skew normal distributions', Revista Colombiana de Estadística 35(3), 457-478.
25. Quandt, R. E. (1972), 'A new approach to estimating switching regressions', Journal of the American Statistical Association 67, 306-310.
26. Quandt, R. E. & Ramsey, J. B. (1978), 'Estimating mixtures of normal distributions and switching regressions', Journal of the American Statistical Association 73, 730-752.
27. Schwarz, G. (1978), 'Estimating the dimension of a model', Annals of Statistics 6(2), 461-464.
28. Shen, H., Yang, J. & Wang, S. (2004), Outlier detecting in fuzzy switching regression models, 'International Conference on Artificial Intelligence: Methodology, Systems, and Applications', Springer, p. 208-215.
29. Song, W., Yao, W. & Xing, Y. (2014), 'Robust mixture regression model fitting by Laplace distribution', Computational Statistics and Data Analysis 71, 128-137.
30. Wei, Y. (2012), Robust mixture regression models using t-distribution, Master's thesis, Kansas State University.
31. Yao, W., Wei, Y. & Yu, C. (2014), 'Robust mixture regression using the t-distribution', Computational Statistics and Data Analysis 71, 116-127.
32. Zeller, C. B., Cabral, C. R. B. & Lachos, V. H. (2016), 'Robust mixture regression modeling based on scale mixtures of skew normal distributions', Test 25, 375-396.
33. Zhang, J. (2013), Robust mixture regression modeling with Pearson type VII distribution, Master's thesis, Kansas State University.
This article can be cited in LaTeX using the following BibTeX reference:
@ARTICLE{RCEv40n1a09,
AUTHOR = {Dogru, Fatma Zehra and Arslan, Olcay},
TITLE = {{Robust Mixture Regression Based on the Skew t Distribution}},
JOURNAL = {Revista Colombiana de Estadística},
YEAR = {2017},
volume = {40},
number = {1},
pages = {45-64}
}
CrossRef Cited-by
1. Haroon M. Barakat, Abdallh W. Aboutahoun, Naeema El-kadar. (2019). A New Extended Mixture Skew Normal Distribution, With Applications. Revista Colombiana de Estadística, 42(2), p.167. https://doi.org/10.15446/rce.v42n2.70087.
2. Fatma Zehra Doğru, Keming Yu, Olcay Arslan. (2019). Heteroscedastic and heavy-tailed regression with mixtures of skew Laplace normal distributions. Journal of Statistical Computation and Simulation, 89(17), p.3213. https://doi.org/10.1080/00949655.2019.1658111.
3. Chris Adcock, Adelchi Azzalini. (2020). A Selective Overview of Skew-Elliptical and Related Distributions and of Their Applications. Symmetry, 12(1), p.118. https://doi.org/10.3390/sym12010118.
4. Fatma Zehra Doğru, Olcay Arslan. (2021). Robust mixture regression modeling based on the generalized M (GM)-estimation method. Communications in Statistics - Simulation and Computation, 50(9), p.2643. https://doi.org/10.1080/03610918.2019.1610442.
5. H. M. Barakat, M. H. Dwes, Tuncer Acar. (2022). Limit Distributions of Ordered Random Variables in Mixture of Two Gaussian Sequences. Journal of Mathematics, 2022(1). https://doi.org/10.1155/2022/7956195.
6. Víctor Hugo Lachos Dávila, Celso Rômulo Barbosa Cabral, Camila Borelli Zeller. (2018). Finite Mixture of Skewed Distributions. SpringerBriefs in Statistics, p.77. https://doi.org/10.1007/978-3-319-98029-4_6.
7. H. M. Barakat, M. H. Dwes. (2022). Asymptotic behavior of ordered random variables in mixture of two Gaussian sequences with random index. AIMS Mathematics, 7(10), p.19306. https://doi.org/10.3934/math.20221060.
8. Atefeh Zarei, Zahra Khodadadi, Mohsen Maleki, Karim Zare. (2023). Robust mixture regression modeling based on two-piece scale mixtures of normal distributions. Advances in Data Analysis and Classification, 17(1), p.181. https://doi.org/10.1007/s11634-022-00495-6.
9. Fatma Zehra Doğru, Olcay Arslan. (2017). Parameter estimation for mixtures of skew Laplace normal distributions and application in mixture regression modeling. Communications in Statistics - Theory and Methods, 46(21), p.10879. https://doi.org/10.1080/03610926.2016.1252400.
10. Weizhong Tian, Fengrong Wei, Thomas Brown. (2020). Mixture network autoregressive model with application on students’ successes. Frontiers of Mathematics in China, 15(1), p.141. https://doi.org/10.1007/s11464-020-0813-5.
11. Fatma Zehra Doğru, Olcay Arslan. (2023). Mixture regression modelling based on the shape mixtures of skew Laplace normal distribution. Journal of Statistical Computation and Simulation, 93(18), p.3403. https://doi.org/10.1080/00949655.2023.2226281.
License
Copyright (c) 2017 Revista Colombiana de Estadística

This work is licensed under a Creative Commons Attribution 4.0 International License.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).