PLS Generalized Linear Regression and Kernel Multilogit Algorithm (KMA) for Microarray Data Classification Problem
Regresión lineal generalizada por MCP y algoritmo kernel multilogit para la clasificación de datos de microarreglos
DOI: https://doi.org/10.15446/rce.v43n2.81811
Keywords: Generalized linear regression, Kernel multilogit algorithm, Partial least squares (en); Regresión lineal generalizada, Algoritmo de kernel multilogit, Mínimos cuadrados parciales (es)
This study combines the partial least squares generalized linear regression model (PLS-GLR) with logistic regression and linear discriminant analysis to obtain two models: PLS generalized logistic regression and PLS generalized logistic regression with discriminant analysis. A comparative study is carried out against classical classifiers such as k-nearest neighbours (KNN), linear discriminant analysis (LDA), partial least squares discriminant analysis (PLS-DA), partial least squares regression (PLS), and support vector machines (SVM). In addition, a new methodology known as the kernel multilogit algorithm (KMA) is implemented, and its performance is compared with that of the other classifiers. According to the classification error rates obtained on the different types of data, the KMA achieves the best results.
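The comparison above rests on classification error rates. As a minimal sketch of how such a rate can be computed for one of the classical classifiers mentioned (k-nearest neighbours), the following pure-Python example estimates a leave-one-out misclassification rate. The toy data, the function names `knn_predict` and `loo_error_rate`, and the leave-one-out scheme are assumptions made for illustration; they are not taken from the paper, which may use a different validation protocol.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query point x.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_error_rate(X, y, k=3):
    """Leave-one-out misclassification error rate of k-NN on the samples (X, y)."""
    errors = 0
    for i in range(len(X)):
        # Hold out sample i and train on all the others.
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], k) != y[i]:
            errors += 1
    return errors / len(X)


# Hypothetical two-feature "expression" values; real microarray data
# would have thousands of features per sample.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [3.0, 3.1], [3.2, 2.9], [2.9, 3.3]]
y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

if __name__ == "__main__":
    print("leave-one-out error rate:", loo_error_rate(X, y, k=3))
```

The same `loo_error_rate` loop could wrap any other classifier's prediction function, so that all methods are scored on identical held-out samples.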
License
Copyright (c) 2020 Revista Colombiana de Estadística

This work is licensed under a Creative Commons Attribution 4.0 International License.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).