Published

2017-01-16

Robust mixture regression based on the skew t distribution

Mixtura robusta de modelos de regresión basada en la distribución t asimétrica

DOI:

https://doi.org/10.15446/rce.v40n1.53580

Keywords:

Mixture regression models, robust regression, maximum likelihood, EM algorithm, skew t distribution (en)

Authors

  • Fatma Zehra Doğru, Ankara University
  • Olcay Arslan, Ankara University

In this study, we propose a robust mixture regression procedure based on the skew t distribution to model heavy-tailed and/or skewed errors in a mixture regression setting. Using the scale mixture representation of the skew t distribution, we give an Expectation Maximization (EM) algorithm to compute the maximum likelihood (ML) estimates for the parameters of interest. The performance of the proposed estimators is demonstrated by a simulation study and a real data example.

Fatma Zehra Doğru (1), Olcay Arslan (2)

(1) Giresun University, Faculty of Arts and Sciences, Department of Statistics, Giresun, Turkey. PhD. Email: fatma.dogru@giresun.edu.tr
(2) Ankara University, Faculty of Science, Department of Statistics, Ankara, Turkey. PhD. Email: oarslan@ankara.edu.tr


Abstract

In this study, we explore a robust mixture regression procedure based on the skew t distribution in order to model heavy-tailed and/or skewed errors in a mixture regression setting. We present an EM-type algorithm to compute the maximum likelihood estimators for the parameters of interest using the scale mixture representation of the skew t distribution. The performance of the proposed estimators is demonstrated by a simulation study and a real data example.

Key words: EM Algorithm, Maximum Likelihood, Mixture Regression Model, Skew t Distribution.



Full text available in PDF



[Received April 2015. Accepted February 2016]

This article can be cited in LaTeX using the following BibTeX entry:

@ARTICLE{RCEv40n1a09,
    AUTHOR  = {Dogru, Fatma Zehra and Arslan, Olcay},
    TITLE   = {{Robust Mixture Regression Based on the Skew t Distribution}},
    JOURNAL = {Revista Colombiana de Estadística},
    YEAR    = {2017},
    volume  = {40},
    number  = {1},
    pages   = {45-64}
}

How to Cite

APA

Doğru, F. Z. and Arslan, O. (2017). Robust mixture regression based on the skew t distribution. Revista Colombiana de Estadística, 40(1), 45–64. https://doi.org/10.15446/rce.v40n1.53580

CrossRef Cited-by

CrossRef citations: 11

1. Haroon M. Barakat, Abdallh W. Aboutahoun, Naeema El-kadar. (2019). A New Extended Mixture Skew Normal Distribution, With Applications. Revista Colombiana de Estadística, 42(2), p.167. https://doi.org/10.15446/rce.v42n2.70087.

2. Fatma Zehra Doğru, Keming Yu, Olcay Arslan. (2019). Heteroscedastic and heavy-tailed regression with mixtures of skew Laplace normal distributions. Journal of Statistical Computation and Simulation, 89(17), p.3213. https://doi.org/10.1080/00949655.2019.1658111.

3. Chris Adcock, Adelchi Azzalini. (2020). A Selective Overview of Skew-Elliptical and Related Distributions and of Their Applications. Symmetry, 12(1), p.118. https://doi.org/10.3390/sym12010118.

4. Fatma Zehra Doğru, Olcay Arslan. (2021). Robust mixture regression modeling based on the generalized M (GM)-estimation method. Communications in Statistics - Simulation and Computation, 50(9), p.2643. https://doi.org/10.1080/03610918.2019.1610442.

5. H. M. Barakat, M. H. Dwes, Tuncer Acar. (2022). Limit Distributions of Ordered Random Variables in Mixture of Two Gaussian Sequences. Journal of Mathematics, 2022(1). https://doi.org/10.1155/2022/7956195.

6. Víctor Hugo Lachos Dávila, Celso Rômulo Barbosa Cabral, Camila Borelli Zeller. (2018). Finite Mixture of Skewed Distributions. SpringerBriefs in Statistics, p.77. https://doi.org/10.1007/978-3-319-98029-4_6.

7. H. M. Barakat, M. H. Dwes. (2022). Asymptotic behavior of ordered random variables in mixture of two Gaussian sequences with random index. AIMS Mathematics, 7(10), p.19306. https://doi.org/10.3934/math.20221060.

8. Atefeh Zarei, Zahra Khodadadi, Mohsen Maleki, Karim Zare. (2023). Robust mixture regression modeling based on two-piece scale mixtures of normal distributions. Advances in Data Analysis and Classification, 17(1), p.181. https://doi.org/10.1007/s11634-022-00495-6.

9. Fatma Zehra Doğru, Olcay Arslan. (2017). Parameter estimation for mixtures of skew Laplace normal distributions and application in mixture regression modeling. Communications in Statistics - Theory and Methods, 46(21), p.10879. https://doi.org/10.1080/03610926.2016.1252400.

10. Weizhong Tian, Fengrong Wei, Thomas Brown. (2020). Mixture network autoregressive model with application on students’ successes. Frontiers of Mathematics in China, 15(1), p.141. https://doi.org/10.1007/s11464-020-0813-5.

11. Fatma Zehra Doğru, Olcay Arslan. (2023). Mixture regression modelling based on the shape mixtures of skew Laplace normal distribution. Journal of Statistical Computation and Simulation, 93(18), p.3403. https://doi.org/10.1080/00949655.2023.2226281.
