Combining Some Biased Estimation Methods with Least Trimmed Squares Regression and its Application
DOI: https://doi.org/10.15446/rce.v38n2.51675
Keywords: Biased Estimator, Least Trimmed Squares, Robust Estimation
When multicollinearity and outliers arise together in regression analysis, researchers must deal with both problems simultaneously. Biased estimation methods based on robust estimators are useful for estimating the regression coefficients in such cases. In this study, we examine several robust biased estimators on datasets from the literature containing outliers in the x direction and outliers in both the x and y directions, using the R package ltsbase. Rather than presenting a complete data analysis, we evaluate the robust biased estimators through the capabilities and features of this package.
1Anadolu University, Science Faculty, Department of Statistics, Eskisehir, Turkey. Professor. Email: bkan@anadolu.edu.tr
2Eskisehir Osmangazi University, Faculty of Arts and Sciences, Department of Statistics, Eskisehir, Turkey. Professor. Email: oalpu@ogu.edu.tr
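For context before the full text: the estimators combined in this work follow a simple recipe in which the least trimmed squares (LTS) estimate replaces the OLS solution inside the classical ridge (Hoerl & Kennard 1970) and Liu (1993) estimators. The sketch below shows this standard construction; the paper's exact standardization and tuning conventions may differ.

```latex
% LTS minimizes the sum of the h smallest squared residuals
% (r_{(1)}^2 \le \dots \le r_{(n)}^2, with n/2 \le h \le n):
\hat{\beta}_{\mathrm{LTS}} = \arg\min_{\beta} \sum_{i=1}^{h} r_{(i)}^{2}(\beta)

% Ridge form (Hoerl & Kennard 1970) with the OLS estimate replaced
% by the LTS estimate; tuning constant k > 0:
\hat{\beta}_{\mathrm{ridge}}(k) = (X^{\top}X + kI)^{-1} X^{\top}X \, \hat{\beta}_{\mathrm{LTS}}

% Liu form (Liu 1993) with the same substitution; 0 < d < 1:
\hat{\beta}_{\mathrm{Liu}}(d) = (X^{\top}X + I)^{-1} (X^{\top}X + dI) \, \hat{\beta}_{\mathrm{LTS}}
```

Both estimators trade a small bias for a reduction in variance under multicollinearity, while the LTS plug-in keeps the fit resistant to the outliers described in the abstract.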
Full text available in PDF
References
1. Agullo, J. (2001), 'New algorithms for computing the least trimmed squares regression estimator', Computational Statistics and Data Analysis 36(4), 425-439.
2. Alfons, A. (2013), sparseLTSEigen: RcppEigen back end for sparse least trimmed squares regression. R package version 0.2.0. http://CRAN.R-project.org/package=sparseLTSEigen
3. Atkinson, A. & Weisberg, S. (1991), Simulated annealing for the detection of multiple outliers using least squares and least median of squares fitting, in W. Stahel & S. Weisberg, eds, 'Directions in Robust Statistics and Diagnostics', Springer-Verlag, New York.
4. Belsley, D. (1991), Conditioning Diagnostics: Collinearity and Weak Data in Regression, 1 edn, John Wiley & Sons, New York.
5. Chatterjee, S. & Hadi, A. S. (2006), Regression Analysis by Examples, 4 edn, John Wiley & Sons, New York.
6. Cizek, P. (2005), 'Least trimmed squares in nonlinear regression under dependence', Journal of Statistical Planning and Inference 136(11), 3967-3988.
7. Fox, J. & Weisberg, S. (2011), An R Companion to Applied Regression, 2 edn, Sage, Thousand Oaks, California.
8. Gujarati, D. (2004), Basic Econometrics, 4 edn, McGraw-Hill.
9. Hawkins, D. (1994), 'The feasible solution algorithm for least trimmed squares regression', Computational Statistics and Data Analysis 17(2), 185-196.
10. Hawkins, D., Bradu, D. & Kass, G. (1984), 'Location of several outliers in multiple regression data using elemental sets', Technometrics 26(3), 197-208.
11. Hawkins, D. & Olive, D. (2002), 'Inconsistency of resampling algorithms for high-breakdown regression estimators and a new algorithm', Journal of the American Statistical Association 97(457), 136-159.
12. Heumann, C., Shalabh, Rao, C. & Toutenburg, H. (2008), Linear Models and Generalizations: Least Squares and Alternatives, 3 edn, Springer, New York.
13. Hoerl, A. & Kennard, R. (1970), 'Ridge regression: Biased estimation for nonorthogonal problems', Technometrics 12(1), 55-67.
14. Hossjer, O. (1995), 'Exact computation of the least trimmed squares estimate in simple linear regression', Computational Statistics and Data Analysis 19(3), 265-282.
15. Jung, K. (2005), 'Multivariate least-trimmed squares regression estimator', Computational Statistics and Data Analysis 48(2), 307-316.
16. Kan Kilinc, B. & Alpu, O. (2013), ltsbase: Ridge and Liu Estimates based on LTS Method. R package version 1.0.1. http://CRAN.R-project.org/package=ltsbase
17. Kan, B., Alpu, O. & Yazici, B. (2013), 'Robust ridge and robust Liu estimator for regression based on the LTS estimator', Journal of Applied Statistics 40(3), 644-655.
18. Li, L. (2005), 'An algorithm for computing exact least-trimmed squares estimate of simple linear regression with constraints', Computational Statistics and Data Analysis 48(4), 717-734.
19. Liu, K. (1993), 'A new class of biased estimate in linear regression', Communications in Statistics-Theory and Methods 22(2), 393-402.
20. Maguna, F. P., Nunez, M. B., Okulik, N. & Castro, E. A. (2003), 'Improved QSAR analysis of the toxicity of aliphatic carboxylic acids', Russian Journal of General Chemistry 73(11), 1792-1798.
21. Marquardt, D. & Snee, R. (1975), 'Ridge regression in practice', The American Statistician 29(1), 3-20.
22. Mason, R. & Gunst, R. (1985), 'Outlier-induced collinearities', Technometrics 27(4), 401-407.
23. Neykov, N. & Neytchev, P. (1991), 'Least median of squares, least trimmed squares and S estimations by means of BMDP3R and BMDPAR', Computational Statistics Quarterly 4, 281-293.
24. R Development Core Team (2013), R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. https://www.r-project.org/
25. Rousseeuw, P. J. & van Driessen, K. (1999), 'A fast algorithm for the minimum covariance determinant estimator', Technometrics 41(3), 212-223.
26. Rousseeuw, P., Croux, C., Todorov, V., Ruckstuhl, A., Salibian-Barrera, M., Verbeke, T., Koller, M. & Maechler, M. (2012), robustbase: Basic Robust Statistics. R package version 0.9-8. http://CRAN.R-project.org/package=robustbase
27. Rousseeuw, P. & Leroy, A. (1987), Robust Regression and Outlier Detection, John Wiley & Sons, New York.
28. Rousseeuw, P. & van Driessen, K. (2006), 'Computing LTS regression for large data sets', Data Mining and Knowledge Discovery 12(1), 29-45.
29. Ruppert, D. (1992), 'Computing S estimators for regression and multivariate location/dispersion', Journal of Computational and Graphical Statistics 1(3), 253-270.
30. Ruppert, D. & Carroll, R. (1980), 'Trimmed least squares estimation in the linear model', Journal of the American Statistical Association 75, 828-838.
31. Stromberg, A. (1993), 'Computing the exact least median of squares estimate and stability diagnostics in multiple linear regression', SIAM Journal on Scientific Computing 14(6), 1289-1299.
32. Tichavsky, P. (1991), 'Algorithms for and geometrical characterization of solutions in the LMS and the LTS linear regression', Computational Statistics Quarterly 6(2), 139-151.
33. Venables, W. & Ripley, B. (2002), Modern Applied Statistics with S, 4 edn, Springer, New York.
34. Willems, G. & van Aelst, S. (2005), 'Fast and robust bootstrap for LTS', Computational Statistics and Data Analysis 48(4), 703-715.
35. Wissmann, M., Toutenburg, H. & Shalabh (2007), Role of Categorical Variables in Multicollinearity in Linear Regression Model, Technical Report, Department of Statistics, University of Munich, Germany.
This article can be cited in LaTeX using the following BibTeX reference:
@ARTICLE{RCEv38n2a11,
AUTHOR = {Kan-Kilinç, Betül and Alpu, Ozlem},
TITLE = {{Combining some Biased Estimation Methods with Least Trimmed Squares Regression and its Application}},
JOURNAL = {Revista Colombiana de Estadística},
YEAR = {2015},
volume = {38},
number = {2},
pages = {385-402}
}
Cited by (CrossRef):
1. Hošek, M., Bednárek, J., Popelka, J., Elznicová, J., Tůmová, Š., Rohovec, J., Navrátil, T. & Matys Grygar, T. (2020), 'Persistent mercury hot spot in Central Europe and Skalka Dam reservoir as a long-term mercury trap', Environmental Geochemistry and Health 42(5), 1273. https://doi.org/10.1007/s10653-019-00408-1
License
Copyright (c) 2015 Revista Colombiana de Estadística. This work is licensed under a Creative Commons Attribution 4.0 International License.