A Graphical Diagnostic Test for Two-Way Contingency Tables
DOI: https://doi.org/10.15446/rce.v39n1.55142
Keywords: Contingency Tables, Diagnostics, Statistical Graphics
We propose and illustrate a new graphical method to perform diagnostic analyses in two-way contingency tables. In this method, one observation is added to or removed from each cell at a time, whilst the other cells are held constant, and the change in a test statistic of interest is graphically represented. The method provides a very simple way of determining how robust our model (and hence our conclusions) is to small changes introduced to the data. We illustrate how the method works with four examples, three of them from real-world applications.
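The perturbation scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the statistic of interest is Pearson's chi-square for independence (the abstract does not fix a particular statistic), the counts are made up, and the function names `chi_square` and `perturbation_changes` are hypothetical. It computes the changes that would be plotted, but does not draw the plot itself.

```python
def chi_square(table):
    """Pearson chi-square statistic for independence in a two-way table.

    Assumes every row and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def perturbation_changes(table, delta):
    """Change in the statistic when `delta` (+1 or -1) is applied to one
    cell at a time while all other cells are held constant."""
    base = chi_square(table)
    changes = []
    for i, row in enumerate(table):
        change_row = []
        for j, count in enumerate(row):
            if count + delta < 0:
                # An observation cannot be removed from an empty cell.
                change_row.append(None)
                continue
            perturbed = [list(r) for r in table]  # copy, then alter one cell
            perturbed[i][j] += delta
            change_row.append(chi_square(perturbed) - base)
        changes.append(change_row)
    return changes


# Made-up 2 x 2 table of counts, for illustration only.
table = [[12, 5], [7, 14]]
added = perturbation_changes(table, +1)    # one observation added per cell
removed = perturbation_changes(table, -1)  # one observation removed per cell

for label, result in (("added", added), ("removed", removed)):
    print(label, result)
```

Each entry of `added` and `removed` is the change in the statistic attributable to a single cell. Plotting these values cell by cell gives a picture of how sensitive the test is to a one-observation change in the data.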
1. John Curtin School of Medical Research, Australian National University, The Arcos-Burgos Group, Canberra, ACT, Australia. University of Antioquia, Neuroscience Research Group, Medellín, Colombia. National University of Colombia, Department of Statistics, Research Group in Statistics, Medellín, Colombia. Ph.D. Scholar. Email: jorge.velez@anu.edu.au
2. Stockholm University, Frescati Hagväg, Department of Psychology, Gösta Ekman Laboratory, Stockholm, Sweden. Postdoctoral Fellow. Email: fernando.marmolejo.ramos@psychology.su.se
3. National University of Colombia, Department of Statistics, Research Group in Statistics, Medellín, Colombia. National University of Colombia, Department of Statistics, Medellín, Colombia. Associate Professor. Email: jcorrea@unal.edu.co
Full text available in PDF
This article can be cited in LaTeX using the following BibTeX entry:
@ARTICLE{RCEv39n1a07,
AUTHOR = {Vélez, Jorge Iván and Marmolejo-Ramos, Fernando and Correa, Juan Carlos},
TITLE = {{A Graphical Diagnostic Test for Two-Way Contingency Tables}},
JOURNAL = {Revista Colombiana de Estadística},
YEAR = {2016},
volume = {39},
number = {1},
pages = {97-108}
}
CrossRef Cited-by
1. Shakeel Shahzad, Saud Ahmed Khan, Atiq Ur Rehman. (2022). Most Stringent Test of Independence in W × K Contingency Tables for Nominal Data Using Monte Carlo Simulations. Sustainable Business and Society in Emerging Economies, 4(1), p.45. https://doi.org/10.26710/sbsee.v4i1.2145.
2. Özge Karadağ, Gökçen Altun, Serpil Aktaş. (2020). Assessment of SNP-SNP interactions by using square contingency table analysis. Anais da Academia Brasileira de Ciências, 92(3). https://doi.org/10.1590/0001-3765202020190465.
3. Piotr Sulewski. (2019). The LMS for testing independence in two-way contingency tables. Biometrical Letters, 56(1), p.17. https://doi.org/10.2478/bile-2019-0003.
License
Copyright 2016 Revista Colombiana de Estadística
This work is licensed under a Creative Commons Attribution 4.0 International License.
- Authors retain their copyright and grant the journal the right of first publication of their work, which is simultaneously subject to the Creative Commons Attribution License (CC Attribution 4.0), allowing third parties to share the work provided that its author and its first publication in this journal are acknowledged.
- Authors may enter into other non-exclusive license agreements for distributing the published version of the work (e.g., depositing it in an institutional repository or publishing it in a monographic volume), provided that its initial publication in this journal is acknowledged.
- Authors are permitted and encouraged to share their work online (e.g., in institutional repositories or on their own website) before and during the submission process, as this can lead to productive exchanges and increase citations of the published work. (See The Effect of Open Access.)