Published

2022-01-01

Variable Selection in Switching Dynamic Regression Models

Selección de variables en modelos de regresión dinámicos de cambios de régimen

DOI:

https://doi.org/10.15446/rce.v45n1.85385

Keywords:

Dirichlet process, State-space model, Hierarchical model, Bayesian filtering and smoothing (en)
Proceso Dirichlet, Modelo de Espacio-estado, Modelo jerárquico, Filtering and smoothing bayesianos (es)

Authors

  • Dayna P. Saldaña-Zepeda Universidad de Colima
  • Ciro Velasco-Cruz Colegio de Postgraduados
  • Víctor H. Torres-Preciado Universidad de Colima

Complex dynamic phenomena whose dynamics are tied to events (modes) that cause structural changes over time are well described by the switching linear dynamical system (SLDS). We extend the SLDS by allowing the measurement noise to be mode-specific, which offers a flexible way to model nonstationary data. Additionally, for models that are functions of explanatory variables, we adapt a variable selection method to identify which variables are significant in each mode. The proposed model is a flexible Bayesian nonparametric model that learns the number of modes and their locations and, within each mode, identifies the significant variables and estimates the regression coefficients. Model performance is evaluated by simulation, and two application examples based on meteorological time series from Barranquilla, Colombia, are presented.

Fenómenos dinámicos complejos en los que la dinámica está relacionada con eventos (modos) que provocan cambios estructurales a lo largo del tiempo, se aproximan mediante un sistema dinámico lineal de cambio de régimen (SDLR). Extendemos el SDLR al permitir que el error de medición sea específico del modo, una forma flexible de modelar datos no estacionarios. Además, para los modelos que son funciones de variables explicativas, adaptamos un método de selección de variables para identificar cuáles de ellas son significativas en cada modo. El modelo propuesto es un modelo bayesiano no paramétrico flexible que permite conocer el número de modos y su ubicación, y dentro de cada modo, identifica las variables significativas y estima los coeficientes de regresión. El desempeño del modelo se evalúa mediante simulación y se presentan dos ejemplos de aplicación de un conjunto de datos de series de tiempo meteorológicas de Barranquilla, Colombia.
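
The kind of data the abstract describes can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the authors' model or inference procedure: it fixes two modes in advance (the paper learns the number of modes nonparametrically), uses a scalar random-walk latent level, and encodes variable selection with 0/1 inclusion indicators. The names `MODES`, `STAY_PROB`, `STATE_SD`, `regression_mean`, and `simulate`, and all numeric values, are hypothetical choices made for illustration.

```python
import random

# Hypothetical two-mode specification (illustrative values only).
# "gamma" holds 0/1 inclusion indicators: a 0 switches a variable off in that mode.
# "obs_sd" is the mode-specific measurement noise standard deviation.
MODES = [
    {"beta": [2.0, -1.0, 0.5], "gamma": [1, 0, 1], "obs_sd": 0.1},
    {"beta": [0.3, 1.5, -2.0], "gamma": [0, 1, 1], "obs_sd": 1.0},
]
STAY_PROB = 0.95  # assumed probability of remaining in the current mode
STATE_SD = 0.05   # assumed innovation sd of the latent random-walk level


def regression_mean(level, x, spec):
    """Latent level plus the regression on the variables selected in this mode."""
    return level + sum(
        g * b * xi for g, b, xi in zip(spec["gamma"], spec["beta"], x)
    )


def simulate(n_steps, seed=0):
    """Simulate a switching dynamic regression with mode-specific noise."""
    rng = random.Random(seed)
    mode, level = 0, 0.0
    records = []
    for _ in range(n_steps):
        # Markov switching: leave the current mode with probability 1 - STAY_PROB.
        if rng.random() > STAY_PROB:
            mode = 1 - mode
        spec = MODES[mode]
        # Latent level evolves as a Gaussian random walk.
        level += rng.gauss(0.0, STATE_SD)
        # Explanatory variables drawn as standard normals for illustration.
        x = [rng.gauss(0.0, 1.0) for _ in spec["beta"]]
        mean = regression_mean(level, x, spec)
        # Measurement noise depends on the active mode.
        y = mean + rng.gauss(0.0, spec["obs_sd"])
        records.append({"mode": mode, "x": x, "y": y, "mean": mean})
    return records
```

Calling `simulate(200, seed=1)` returns a list of 200 records whose `mode` field marks the active regime. Both the set of active variables and the noise level change when the mode switches, which is the structure a mode-wise variable selection method is meant to recover.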

How to Cite

APA

Saldaña-Zepeda, D. P., Velasco-Cruz, C., & Torres-Preciado, V. H. (2022). Variable Selection in Switching Dynamic Regression Models. Revista Colombiana de Estadística, 45(1), 231–263. https://doi.org/10.15446/rce.v45n1.85385
