Published

2015-01-01

Accounting for Model Selection Uncertainty: Model Averaging of Prevalence and Force of Infection Using Fractional Polynomials

Un método para la inclusión de la incertidumbre en la selección del modelo: promedio de modelos para la prevalencia y la fuerza de infección usando polinomios fraccionarios

DOI:

https://doi.org/10.15446/rce.v38n1.48808

Keywords:

Bias, Mean Squared Error, Multimodel Estimation, Seroprevalence (en)
Error cuadrado medio, Estimación multi-modelo, Seroprevalencia, Sesgo (es)

Authors

  • Javier Castañeda Medtronic Bakken Research Center, Maastricht, Netherlands
  • Marc Aerts CenStat, Universiteit Hasselt, Diepenbeek, Belgium

In most statistical applications the true model underlying the data-generating mechanism is unknown, and researchers are confronted with the critical issue of model selection uncertainty. Often this uncertainty is ignored and the model with the best goodness-of-fit is treated as the data-generating model, which leads to over-confident inferences. In this paper we present a methodology that accounts for model selection uncertainty in the estimation of age-dependent prevalence and force of infection, using model averaging of fractional polynomials. We illustrate the method on a cross-sectional seroprevalence sample of hepatitis A, taken in Belgium in 1993. In a simulation study we show that model-averaged estimates of prevalence and force of infection based on fractional polynomials have desirable features, such as smaller mean squared error and greater robustness, compared with the common practice of basing estimation on a single selected “best” model.
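For orientation, the quantities named in the abstract can be written out as below. This is only a sketch under assumptions that are standard in the serological-survey literature (lifelong immunity, a time-homogeneous infection process, a logit link, and distinct powers in the fractional polynomial); the abstract does not fix any notation, so the symbols $\pi$, $\lambda$, $\eta_m$, $\beta_j$ and $p_j$ are assumptions introduced here.

```latex
% pi(a): prevalence (proportion seropositive) at age a
% lambda(a): force of infection, the rate at which susceptibles of age a become infected
\[
  \lambda(a) = \frac{\pi'(a)}{1 - \pi(a)}
\]
% Fractional polynomial of degree m as linear predictor under a logit link,
% with each power p_j taken from a small fixed set such as {-2, -1, -0.5, 0, 0.5, 1, 2, 3}
\[
  \operatorname{logit} \pi(a) = \eta_m(a) = \beta_0 + \sum_{j=1}^{m} \beta_j \, a^{(p_j)},
  \qquad
  a^{(p)} =
  \begin{cases}
    a^{p}, & p \neq 0, \\
    \ln a, & p = 0.
  \end{cases}
\]
% Since pi'(a) = eta_m'(a) pi(a) (1 - pi(a)) under the logit link, the first display gives
\[
  \lambda(a) = \eta_m'(a) \, \pi(a)
\]
```

Under these assumptions a fitted curve gives a non-negative force of infection only where $\eta_m'(a) \ge 0$, that is, where the estimated prevalence does not decrease with age.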

In most statistical applications the true model that governs the data-generating mechanism is unknown, and researchers must deal with model selection uncertainty. This uncertainty is often ignored when only the model that best fits the observed data is used, which leads to estimates whose confidence level is lower than intended. Infectious diseases can be studied through parameters such as age-dependent prevalence and the force of infection. In this paper we estimate these two parameters using fractional polynomials and propose model averaging to incorporate the variability due to model selection uncertainty. We illustrate the methodology on a seroprevalence sample of hepatitis A collected in Belgium in 1993. Through simulations we show that the proposed methodology has attractive properties, such as smaller mean squared error and more robust estimates, compared with the common practice of basing estimation on a single model.
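The idea of averaging over candidate models instead of picking one can be illustrated with a minimal sketch. It assumes AIC-based (Akaike) weights and a widely used form of the unconditional standard error; the abstract does not state which weighting scheme the authors use, and the function names and all numbers below are hypothetical.

```python
import math


def akaike_weights(aic_values):
    """Turn AIC values into weights that sum to one (smaller AIC, larger weight)."""
    best = min(aic_values)
    # Working with differences to the best model keeps the exponentials stable.
    rel_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel_likelihoods)
    return [r / total for r in rel_likelihoods]


def model_average(estimates, variances, weights):
    """Weighted mean of per-model estimates with an unconditional standard error.

    Each model's variance is inflated by its squared distance to the averaged
    estimate, so disagreement between models widens the reported uncertainty.
    """
    avg = sum(w * est for w, est in zip(weights, estimates))
    se = sum(
        w * math.sqrt(var + (est - avg) ** 2)
        for w, est, var in zip(weights, estimates, variances)
    )
    return avg, se


# Hypothetical inputs: three candidate fractional polynomial models, each with
# an AIC, an estimated prevalence at one age, and the variance of that estimate.
aic = [1002.0, 1000.0, 1006.0]
prevalence_at_age_30 = [0.52, 0.55, 0.49]
variance = [0.0004, 0.0005, 0.0004]

weights = akaike_weights(aic)
avg, se = model_average(prevalence_at_age_30, variance, weights)
print(weights, avg, se)
```

The same two functions apply unchanged to force of infection estimates at a given age. Selecting a single "best" model corresponds to putting weight one on the lowest-AIC model, which drops the between-model term from the standard error.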


JAVIER CASTAÑEDA¹, MARC AERTS²

¹Medtronic Bakken Research Center, Maastricht, Netherlands. Principal Statistician. Email: javier.castaneda@medtronic.com
²Universiteit Hasselt, CenStat, Diepenbeek, Belgium. Director. Email: marc.aerts@uhasselt.be



Full text available in PDF




[Received October 2013. Accepted November 2014]

This article can be cited in LaTeX using the following BibTeX reference:

@ARTICLE{RCEv38n1a09,
    AUTHOR  = {Castañeda, Javier and Aerts, Marc},
    TITLE   = {{Accounting for Model Selection Uncertainty: Model Averaging of Prevalence and Force of Infection Using Fractional Polynomials}},
    JOURNAL = {Revista Colombiana de Estadística},
    YEAR    = {2015},
    volume  = {38},
    number  = {1},
    pages   = {163-179}
}


How to Cite

APA

Castañeda, J. and Aerts, M. (2015). Accounting for Model Selection Uncertainty: Model Averaging of Prevalence and Force of Infection Using Fractional Polynomials. Revista Colombiana de Estadística, 38(1), 163–179. https://doi.org/10.15446/rce.v38n1.48808

