Tools for causal inference from cross-sectional innovation surveys with continuous or discrete variables: Theory and applications

Alex Coad; Dominik Janzing; Paul Nightingale

doi:10.15446/cuad.econ.v37n75.69832

Published

2018-12-01

Keywords:

Causal inference, innovation surveys, machine learning, additive noise models, directed acyclic graphs

Authors

Alex Coad, Pontificia Universidad Católica del Perú
Dominik Janzing, Causal Consulting
Paul Nightingale, University of Sussex

This paper presents a new statistical toolkit that applies three techniques for data-driven causal inference from the machine learning community that are little known among economists and innovation scholars: a conditional independence-based approach, additive noise models, and non-algorithmic inference by hand. We include three applications to Community Innovation Survey (CIS) data to investigate public funding schemes for R&D investment, information sources for innovation, and innovation expenditures and firm growth. Preliminary results provide causal interpretations of some previously observed correlations. Our statistical 'toolkit' could be a useful complement to existing techniques.

How to cite

APA

Coad, A., Janzing, D., & Nightingale, P. (2018). Tools for causal inference from cross-sectional innovation surveys with continuous or discrete variables: Theory and applications. Cuadernos de Economía, 37(75), 779–808. https://doi.org/10.15446/cuad.econ.v37n75.69832

License

Copyright 2018 Cuadernos de Economía

Through the Division of Libraries of the Universidad Nacional de Colombia, Cuadernos de Economía promotes and guarantees open access to all of its content. Articles published by the journal are available worldwide under open access and are licensed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0), which implies the following: