A Multilevel Nonparametric Bayesian Model
DOI: https://doi.org/10.15446/rce.v48n3.122305
Keywords: Nonparametric Bayesian model, Dirichlet process, Chinese Restaurant Process, Clustering, Linear regression.
This work develops a multilevel Bayesian nonparametric model that estimates linear relationships in heterogeneous data sets while simultaneously identifying clusters, without the number of groups having to be specified in advance. The study includes the mathematical development of the model using the Chinese Restaurant Process and the implementation of algorithms for fitting it. Results on real data show that the model performs well both in clustering the data and in characterizing linear relationships, with results comparable to, and even better than, those obtained by traditional parametric methods.
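To illustrate why the number of groups need not be fixed in advance, the following is a minimal sketch of drawing a partition from a Chinese Restaurant Process prior. It is not the authors' fitting algorithm, only the standard seating rule: the customer arriving after i others joins an existing table k with probability n_k / (i + alpha) and opens a new table with probability alpha / (i + alpha). The function name `sample_crp`, the concentration value `alpha=1.0`, and the sample size are assumptions chosen for illustration.

```python
import random


def sample_crp(n_customers, alpha, rng):
    """Draw table assignments for n_customers from a CRP with concentration alpha.

    Illustrative helper (hypothetical name); it simulates the prior only and
    does not involve any regression likelihood.
    """
    assignments = []   # table index for each customer, in arrival order
    table_sizes = []   # current number of customers at each table
    for _ in range(n_customers):
        # Unnormalised weights: n_k for each existing table, alpha for a new one.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # open a new table (new cluster)
        else:
            table_sizes[table] += 1    # join an existing table
        assignments.append(table)
    return assignments, table_sizes


rng = random.Random(0)  # fixed seed so the draw is reproducible
assignments, table_sizes = sample_crp(100, alpha=1.0, rng=rng)
print("number of clusters:", len(table_sizes))
print("cluster sizes:", table_sizes)
```

The number of occupied tables is itself random and grows slowly with the sample size, which is the property that lets a model built on this prior infer the number of clusters from the data. In a model of the kind described above, each table would additionally carry its own regression coefficients.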
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).






