Recent Advances in Visualizing Multivariate Linear Models
DOI: https://doi.org/10.15446/rce.v37n2spe.47934
Keywords: Graphics, Multivariate Analysis, Software, Visualization
This paper reviews our work on visualization methods, implemented in R, for understanding and interpreting the effects of predictors in multivariate linear models (MLMs) of the form Y = XB + U, together with some of their recent extensions.
We begin with a description of, and examples from, the Hypothesis-Error (HE) plot framework (implemented in the heplots package), in which multivariate tests are visualized via ellipsoids, in 2D, 3D, or all pairwise views, for the Hypothesis (H) and Error (E) Sum of Squares and Products (SSP) matrices used in hypothesis tests. Such HE plots provide visual tests of significance: with the H ellipsoid suitably scaled, a term is significant by Roy’s test if and only if its H ellipsoid projects somewhere outside the E ellipsoid. These ideas extend naturally to repeated measures designs in the multivariate context. When the rank of the hypothesis matrix for a term exceeds 2, these effects can also be visualized in a reduced-rank canonical space via the candisc package, which also provides new data plots for canonical correlation problems. Finally, we discuss some recent work in progress: the extension of these methods to robust MLMs, and the development of generalizations of influence measures and diagnostic plots for MLMs (in the mvinfluence package).
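As a rough numerical illustration of the H and E matrices that an HE plot draws as ellipsoids, the following Python sketch computes both SSP matrices for a one-way design with two responses, and then the roots of H E^{-1} on which Roy's test is based. Everything here is an assumption made for illustration: the toy data are invented, the function names (`ssp_matrices`, `roots_of_HEinv`) are hypothetical, and the code is not the heplots API. It also stops short of a significance decision, since that needs a critical value for Roy's statistic, which the sketch does not compute.

```python
import math

# Toy one-way design: three groups, two responses per observation.
# These numbers are invented purely for illustration.
GROUPS = {
    "a": [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    "b": [(4.0, 5.0), (5.0, 4.0), (6.0, 6.0)],
    "c": [(7.0, 5.0), (8.0, 7.0), (9.0, 6.0)],
}


def mean2(rows):
    """Column means of a list of (y1, y2) pairs."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def add_outer(S, d, weight=1.0):
    """Accumulate weight * d d' into the 2x2 matrix S, in place."""
    for i in range(2):
        for j in range(2):
            S[i][j] += weight * d[i] * d[j]


def ssp_matrices(groups):
    """Return (H, E, T): hypothesis, error and total SSP matrices."""
    all_rows = [r for rows in groups.values() for r in rows]
    grand = mean2(all_rows)
    H = [[0.0, 0.0], [0.0, 0.0]]
    E = [[0.0, 0.0], [0.0, 0.0]]
    T = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups.values():
        m = mean2(rows)
        # Between-group part: group mean vs. grand mean, weighted by group size.
        add_outer(H, (m[0] - grand[0], m[1] - grand[1]), weight=len(rows))
        for r in rows:
            # Within-group part: observation vs. its own group mean.
            add_outer(E, (r[0] - m[0], r[1] - m[1]))
            # Total part: observation vs. grand mean (should equal H + E).
            add_outer(T, (r[0] - grand[0], r[1] - grand[1]))
    return H, E, T


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def roots_of_HEinv(H, E):
    """Eigenvalues of H E^{-1}, largest first, from the 2x2 characteristic polynomial."""
    d = det2(E)
    Einv = [[E[1][1] / d, -E[0][1] / d], [-E[1][0] / d, E[0][0] / d]]
    M = [[sum(H[i][k] * Einv[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace, det = M[0][0] + M[1][1], det2(M)
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace + disc) / 2.0, (trace - disc) / 2.0


H, E, T = ssp_matrices(GROUPS)
lam1, lam2 = roots_of_HEinv(H, E)
theta = lam1 / (1.0 + lam1)  # one common form of Roy's largest-root statistic
print("H =", H)
print("E =", E)
print("largest root =", lam1, "theta =", theta)
```

The decomposition T = H + E is a useful sanity check on any such computation. The sketch is restricted to two responses so that the eigenvalues have a closed form; a general implementation would use a matrix library instead.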
Michael Friendly: York University, Faculty of Health, Department of Psychology, Toronto, Canada. Professor. Email: friendly@yorku.ca
Matthew Sigal: York University, Faculty of Health, Department of Psychology, Toronto, Canada. Professor. Email: msigal@yorku.ca
This article can be cited in LaTeX using the following BibTeX reference:
@ARTICLE{RCEv37n2a02,
AUTHOR = {Friendly, Michael and Sigal, Matthew},
TITLE = {{Recent Advances in Visualizing Multivariate Linear Models}},
JOURNAL = {Revista Colombiana de Estadística},
YEAR = {2014},
volume = {37},
number = {2},
pages = {261-283}
}
License
Copyright (c) 2014 Revista Colombiana de Estadística
This work is licensed under a Creative Commons Attribution 4.0 International License.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).