COMPARISON OF THREE CLASSIFICATION TECHNIQUES
Keywords:
Multinomial logistic regression, nonmetric discriminant analysis, linear discriminant analysis, classification
1 Universidad de São Paulo, Instituto de Matemática y Estadística, Departamento de Estadística, São Paulo, Brazil. Doctoral student. Email: fhernanb@ime.usp.br
2 Universidad Nacional de Colombia, Facultad de Ciencias, Departamento de Estadística, Medellín, Colombia. Associate professor. Email: jccorrea@unalmed.edu.co
This paper presents the results of a simulation study comparing three classification techniques: multinomial logistic regression (MLR), nonmetric discriminant analysis (NDA), and linear discriminant analysis (LDA). Performance was measured by the misclassification rate (error classification rate, ECR). MLR and LDA performed similarly, and both clearly outperformed NDA, when the multivariate distribution of the populations was normal or logit-normal. For multivariate log-normal and sinh⁻¹-normal distributions, MLR performed best.
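The kind of comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or design: it is a simplified two-group, two-variable version that compares only LDA with binary logistic regression (fitted here by plain gradient ascent) and leaves NDA out. The group means, identity covariance, sample sizes, seed, and all function names are hypothetical choices made for illustration.

```python
import math
import random

def simulate(n, mean, rng, lognormal=False):
    """Draw n bivariate points with independent unit-variance normal parts.
    With lognormal=True each component is exponentiated (log-normal data)."""
    pts = [[rng.gauss(m, 1.0) for m in mean] for _ in range(n)]
    if lognormal:
        pts = [[math.exp(v) for v in p] for p in pts]
    return pts

def fit_lda(x0, x1):
    """Two-group LDA with pooled covariance and equal priors (2-D only)."""
    def mean(xs):
        return [sum(p[j] for p in xs) / len(xs) for j in range(2)]
    m0, m1 = mean(x0), mean(x1)
    s = [[0.0, 0.0], [0.0, 0.0]]  # pooled within-group scatter
    for xs, m in ((x0, m0), (x1, m1)):
        for p in xs:
            d = [p[0] - m[0], p[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = len(x0) + len(x1) - 2
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Discriminant direction w = S^{-1} (m1 - m0); cut at the midpoint of the means.
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    c = sum(w[j] * (m0[j] + m1[j]) / 2 for j in range(2))
    return lambda p: 1 if w[0] * p[0] + w[1] * p[1] > c else 0

def fit_logistic(x0, x1, steps=500, lr=0.5):
    """Binary logistic regression by gradient ascent on standardized features."""
    xs = x0 + x1
    ys = [0] * len(x0) + [1] * len(x1)
    n = len(xs)
    mu = [sum(p[j] for p in xs) / n for j in range(2)]
    sd = [math.sqrt(sum((p[j] - mu[j]) ** 2 for p in xs) / n) for j in range(2)]
    z = [[(p[j] - mu[j]) / sd[j] for j in range(2)] for p in xs]
    b = [0.0, 0.0, 0.0]  # intercept and two slopes
    for _ in range(steps):
        g = [0.0, 0.0, 0.0]
        for (z1, z2), y in zip(z, ys):
            eta = b[0] + b[1] * z1 + b[2] * z2
            eta = max(-30.0, min(30.0, eta))  # keep exp() well-behaved
            r = y - 1.0 / (1.0 + math.exp(-eta))
            g[0] += r
            g[1] += r * z1
            g[2] += r * z2
        b = [bj + lr * gj / n for bj, gj in zip(b, g)]
    def predict(p):
        eta = b[0] + sum(b[j + 1] * (p[j] - mu[j]) / sd[j] for j in range(2))
        return 1 if eta > 0 else 0
    return predict

def error_rate(predict, t0, t1):
    """Misclassification rate on test points from group 0 and group 1."""
    wrong = sum(predict(p) != 0 for p in t0) + sum(predict(p) != 1 for p in t1)
    return wrong / (len(t0) + len(t1))

def compare(lognormal, seed=1, n_train=200, n_test=1000):
    """Fit both classifiers on one simulated training set and score on a test set."""
    rng = random.Random(seed)
    m0, m1 = [0.0, 0.0], [1.5, 1.5]  # hypothetical group means
    tr0 = simulate(n_train, m0, rng, lognormal)
    tr1 = simulate(n_train, m1, rng, lognormal)
    te0 = simulate(n_test, m0, rng, lognormal)
    te1 = simulate(n_test, m1, rng, lognormal)
    return {"LDA": error_rate(fit_lda(tr0, tr1), te0, te1),
            "LR": error_rate(fit_logistic(tr0, tr1), te0, te1)}

if __name__ == "__main__":
    for flag, name in ((False, "normal"), (True, "log-normal")):
        print(name, compare(flag))
```

A single replication like this is noisy. A study of the kind summarized above would repeat the experiment many times per scenario and average the misclassification rates before drawing conclusions.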
Full text available in PDF
This article can be cited in LaTeX using the following BibTeX entry:
@ARTICLE{RCEv32n2a05,
AUTHOR = {Hernández Barajas, Freddy and Correa Morales, Juan Carlos},
TITLE = {{Comparación entre tres técnicas de clasificación}},
JOURNAL = {Revista Colombiana de Estadística},
YEAR = {2009},
VOLUME = {32},
NUMBER = {2},
PAGES = {247--265}
}
License
Copyright (c) 2009 Revista Colombiana de Estadística

This work is licensed under a Creative Commons Attribution 4.0 International License.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).