Comparison among High Dimensional Covariance Matrix Estimation Methods

KAROLL GÓMEZ¹, SANTIAGO GALLÓN²

¹ Universidad Nacional de Colombia, Facultad de Ciencias Humanas y Económicas, Departamento de Economía, Medellín, Colombia. Universidad de Antioquia, Facultad de Ciencias Económicas, Grupo de Econometría Aplicada, Medellín, Colombia. Assistant professor. Email: kgomezp@unal.edu.co
² Universidad de Antioquia, Facultad de Ciencias Económicas, Departamento de Estadística y Matemáticas - Departamento de Economía, Medellín, Colombia. Universidad de Antioquia, Facultad de Ciencias Económicas, Grupo de Econometría Aplicada, Medellín, Colombia. Assistant professor. Email: santiagog@udea.edu.co


Abstract

Accurate measures of the volatility matrix and its inverse play a central role in risk and portfolio management problems. Because estimation errors in the expected returns and the covariance matrix accumulate, the solutions to these problems are very sensitive to those errors, particularly when the number of assets (p) exceeds the sample size (T). Recent research has focused on developing methods to estimate high dimensional covariance matrices from small samples. The aim of this paper is to examine and compare the minimum variance optimal portfolios constructed using five estimation methods for the covariance matrix: the sample covariance, RiskMetrics, the factor model, shrinkage, and the mixed frequency factor model. Using Monte Carlo simulation, we provide evidence that the mixed frequency factor model and the factor model are highly accurate for portfolios with p close to or larger than T.

Key words: Covariance matrix, High dimensional data, Penalized least squares, Portfolio optimization, Shrinkage.
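
As an illustration of why the case p > T is difficult, the following minimal Python sketch builds a minimum variance portfolio from simulated returns. It is not the authors' simulation design: the i.i.d. normal returns, the values T = 30 and p = 50, the helper names, and the fixed shrinkage intensity delta = 0.5 are all assumptions chosen for illustration, and the shrinkage shown is a plain convex combination with a scaled identity target rather than an optimally tuned estimator. The sample covariance is rank deficient when p > T, so it cannot be inverted; the shrunk matrix is positive definite and yields weights w = Σ⁻¹1 / (1'Σ⁻¹1).

```python
import numpy as np


def min_variance_weights(cov):
    """Global minimum variance weights: w = cov^{-1} 1 / (1' cov^{-1} 1)."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)  # cov^{-1} 1 without forming the inverse
    return x / x.sum()


def shrink_to_identity(sample_cov, delta):
    """Convex combination of the sample covariance and a scaled identity.

    delta is a hypothetical, user-chosen intensity in (0, 1]; it is not
    estimated from the data here.
    """
    p = sample_cov.shape[0]
    target = (np.trace(sample_cov) / p) * np.eye(p)
    return delta * target + (1.0 - delta) * sample_cov


rng = np.random.default_rng(0)
T, p = 30, 50  # fewer observations than assets (assumed values)
returns = rng.normal(scale=0.01, size=(T, p))  # simulated i.i.d. returns

sample_cov = np.cov(returns, rowvar=False)  # p x p, rank at most T - 1
rank = np.linalg.matrix_rank(sample_cov)

shrunk_cov = shrink_to_identity(sample_cov, delta=0.5)
weights = min_variance_weights(shrunk_cov)

print("rank of sample covariance:", rank, "of", p)
print("sum of weights:", weights.sum())
```

In practice the shrinkage intensity would be estimated from the data rather than fixed, and the other estimators compared in the paper would replace `shrink_to_identity` in the same pipeline.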




[Received September 2010. Accepted March 2011]

This article can be cited in LaTeX using the following BibTeX reference:

@ARTICLE{RCEv34n3a09,
    AUTHOR  = {Gómez, Karoll and Gallón, Santiago},
    TITLE   = {{Comparison among High Dimensional Covariance Matrix Estimation Methods}},
    JOURNAL = {Revista Colombiana de Estadística},
    YEAR    = {2011},
    volume  = {34},
    number  = {3},
    pages   = {567-588}
}