Published

2015-07-01

Bootstrap-based inference for grouped data

Inferencia para datos agrupados vía bootstrap

DOI:

https://doi.org/10.15446/rev.fac.cienc.v4n2.54254

Keywords:

Bootstrap, estimation, grouped data (en)
Bootstrap, datos agrupados, estimación (es)

Authors

  • Jorge Iván Vélez, Australian National University
  • Juan Carlos Correa Morales
Grouped data are continuous variables whose values have been partitioned into intervals, not necessarily of equal length, to facilitate their interpretation. Unlike with ungrouped data, estimating simple summary statistics such as the mean and mode, or more complex ones such as a percentile or the coefficient of variation, is difficult with grouped data. When the probability distribution generating the data is unknown, inference for ungrouped data is carried out using parametric or nonparametric resampling methods. However, there are no equivalent methods for grouped data. Here, a bootstrap-based procedure for estimating the parameters of an unknown distribution from grouped data is proposed, described and illustrated.

Los datos agrupados se refieren a variables continuas que se dividen en intervalos no necesariamente de la misma longitud para facilitar su interpretación. Contrario a lo que ocurre en datos no agrupados, la estimación de simples estadísticos de resumen como la media o la moda, o más complejos como un percentil o el coeficiente de variación, es una tarea difícil en datos agrupados. Cuando no se conoce la distribución de probabilidad que genera los datos, la inferencia en datos no agrupados se realiza utilizando métodos paramétricos o no paramétricos de remuestreo. Sin embargo, no existen métodos equivalentes para datos agrupados. En este documento se propone, describe e ilustra un método basado en bootstrap para estimar los parámetros de una distribución desconocida a partir de datos agrupados.
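
The abstract does not spell out the resampling scheme, so the following is only a minimal sketch of one way a bootstrap on grouped data could look, not the authors' procedure. It assumes that observations are uniformly distributed within each interval, and it uses a plain percentile interval. The function name `grouped_bootstrap`, its parameters, and the example intervals and counts are all hypothetical.

```python
import random
import statistics


def grouped_bootstrap(edges, counts, stat, n_boot=2000, seed=0):
    """Bootstrap a statistic from grouped (binned) data.

    ASSUMPTION (not taken from the paper): within each interval the
    unobserved values are uniformly distributed.
    """
    if len(edges) != len(counts) + 1:
        raise ValueError("edges must have exactly one more entry than counts")
    rng = random.Random(seed)
    n = sum(counts)
    intervals = list(zip(edges[:-1], edges[1:]))
    replicates = []
    for _ in range(n_boot):
        # Resample n intervals with probability proportional to their counts.
        chosen = rng.choices(intervals, weights=counts, k=n)
        # Draw one pseudo-observation uniformly inside each chosen interval.
        sample = [rng.uniform(lo, hi) for lo, hi in chosen]
        replicates.append(stat(sample))
    replicates.sort()
    # Percentile interval at the nominal 95% level.
    lower = replicates[int(0.025 * n_boot)]
    upper = replicates[int(0.975 * n_boot) - 1]
    return statistics.fmean(replicates), (lower, upper)


# Hypothetical grouped data: unequal-width intervals and their frequencies.
edges = [0, 10, 20, 40, 80]
counts = [5, 12, 20, 3]
estimate, (lower, upper) = grouped_bootstrap(edges, counts, statistics.fmean)
print(f"mean: {estimate:.2f}, 95% percentile interval: ({lower:.2f}, {upper:.2f})")
```

Any function of a sample can be passed as `stat`, for example `statistics.median` or a coefficient of variation. The uniform-within-interval assumption is the weakest part of the sketch: a smoothed histogram would be a natural replacement when the intervals are wide.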

How to Cite

APA

Vélez, J. I. and Correa Morales, J. C. (2015). Bootstrap-based inference for grouped data. Revista de la Facultad de Ciencias, 4(2), 74–82. https://doi.org/10.15446/rev.fac.cienc.v4n2.54254

CrossRef Cited-by

CrossRef citations: 1

1. Zahra AghahosseinaliShirazi, João Pedro A. R. da Silva, Camila P. E. de Souza. (2024). Parameter estimation for grouped data using EM and MCEM algorithms. Communications in Statistics - Simulation and Computation, 53(8), p.3616. https://doi.org/10.1080/03610918.2022.2108843.
