A Unified Approach to Link Prediction in Collaboration Networks

Juan Sosa; Diego Martínez; Nicolás Guerrero

doi:10.15446/rce.v48n2.117558

Published: 2025-07-01

Un enfoque unificado para la predicción de enlaces en redes de colaboración

Keywords:

Collaboration networks, Exponential random graph model, Graph convolutional network, Word2Vec, Social network analysis. (en)
Redes de colaboración, Modelo exponencial de grafos aleatorios, Red de convolución sobre grafos, Word2Vec, Análisis de redes sociales. (es)

Authors

Juan Sosa, Universidad Nacional de Colombia
Diego Martínez, Universidad Nacional de Colombia
Nicolás Guerrero, Universidad Nacional de Colombia

Abstract (en)

This article investigates and compares three approaches to link prediction in collaboration networks: an ERGM (Exponential Random Graph Model; Robins et al. 2007), a GCN (Graph Convolutional Network; Kipf & Welling 2017), and a Word2Vec+MLP model (a Word2Vec model combined with a multilayer neural network; Mikolov, Chen, Corrado & Dean 2013 and Goodfellow et al. 2016). The ERGM, grounded in statistical methods, is used to capture general structural patterns within the network, while the GCN and Word2Vec+MLP models use deep learning techniques to learn adaptive structural representations of nodes and their relationships. The predictive performance of the models is assessed through extensive simulation exercises using cross-validation, with metrics based on the receiver operating characteristic (ROC) curve. The results clearly show the superiority of the machine learning approaches in link prediction, particularly in large networks, where traditional models such as the ERGM are limited in scalability and in their ability to capture inherent complexities. These findings highlight the potential benefits of integrating statistical modeling techniques with deep learning methods to analyze complex networks, providing a more robust and effective framework for future research in this field.

Abstract (es)

Este artículo investiga y compara tres enfoques para la predicción de enlaces en redes de colaboración: un ERGM (Exponential Random Graph Model; Robins et al., 2007), una GCN (Graph Convolutional Network; Kipf & Welling, 2017) y un modelo Word2Vec+MLP (modelo Word2Vec combinado con una red neuronal multicapa; Mikolov, Chen, Corrado & Dean, 2013, y Goodfellow et al., 2016). El ERGM, basado en métodos estadísticos, se emplea para capturar patrones estructurales generales dentro de la red, mientras que los modelos GCN y Word2Vec+MLP utilizan técnicas de aprendizaje profundo para aprender representaciones estructurales adaptativas de los nodos y sus relaciones. El desempeño predictivo de los modelos se evalúa mediante extensos ejercicios de simulación con validación cruzada, utilizando métricas basadas en la curva característica operativa del receptor (ROC). Los resultados muestran claramente la superioridad de los enfoques de aprendizaje automático en la predicción de enlaces, particularmente en redes grandes, donde los modelos tradicionales como el ERGM presentan limitaciones en escalabilidad y en la capacidad de capturar complejidades inherentes. Estos hallazgos resaltan los posibles beneficios de integrar técnicas de modelado estadístico con métodos de aprendizaje profundo para analizar redes complejas, proporcionando un marco más robusto y efectivo para futuras investigaciones en este campo.
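The abstract describes evaluating link predictors with cross-validation and ROC-based metrics but does not spell out the protocol. The sketch below shows one plausible reading, under loudly stated assumptions: dyads (node pairs) are split into k folds, each fold is hidden in turn, and the held-out dyads are scored from the remaining graph. The scoring function is a simple common-neighbors count used purely as a stand-in; it is not the ERGM, GCN, or Word2Vec+MLP model from the article. The toy graph and all function names are hypothetical.

```python
import random
from itertools import combinations


def auc(pos_scores, neg_scores):
    """Area under the ROC curve as the probability that a random
    positive outscores a random negative (ties count one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def common_neighbors(adj, u, v):
    """Stand-in link score (assumption): number of shared neighbors."""
    return len(adj[u] & adj[v])


def cross_validated_auc(nodes, edges, k=5, seed=0):
    """Mean AUC over k folds of held-out dyads (assumed protocol)."""
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    dyads = [frozenset(d) for d in combinations(nodes, 2)]
    rng.shuffle(dyads)
    folds = [dyads[i::k] for i in range(k)]
    fold_aucs = []
    for held_out in folds:
        hidden = set(held_out)
        # Training graph: observed edges whose dyad is not held out.
        adj = {n: set() for n in nodes}
        for e in edge_set - hidden:
            u, v = tuple(e)
            adj[u].add(v)
            adj[v].add(u)
        pos = [common_neighbors(adj, *tuple(d)) for d in held_out if d in edge_set]
        neg = [common_neighbors(adj, *tuple(d)) for d in held_out if d not in edge_set]
        if pos and neg:  # AUC is undefined without both classes
            fold_aucs.append(auc(pos, neg))
    if not fold_aucs:
        raise ValueError("no fold contained both edges and non-edges")
    return sum(fold_aucs) / len(fold_aucs)


# Hypothetical toy "collaboration" graph: two 6-node cliques joined by one bridge.
group_a = list(range(6))
group_b = list(range(6, 12))
nodes = group_a + group_b
edges = list(combinations(group_a, 2)) + list(combinations(group_b, 2)) + [(5, 6)]

mean_auc = cross_validated_auc(nodes, edges, k=5, seed=0)
print(f"mean cross-validated AUC: {mean_auc:.3f}")
```

Any model that assigns a score to a node pair could be substituted for `common_neighbors` in this loop, which is what makes a shared ROC-based protocol convenient for comparing statistical and deep learning predictors on equal terms.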

References

Amarasinghe, S. et al. (2024), Explainable Artificial Intelligence: Second World Conference, xAI 2024, Springer. https://www.springer.com/

Chiang, W.-L., Liu, X., Si, S., Li, Y., Bengio, S. & Hsieh, C.-J. (2019), Cluster-GCN: An efficient algorithm for training deep and large graph convolutional networks, in 'Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining', KDD '19, Association for Computing Machinery, New York, NY, USA, pp. 257-266. DOI: https://doi.org/10.1145/3292500.3330925

Davis, J. & Goadrich, M. (2006), The relationship between precision-recall and ROC curves, in 'Proceedings of the 23rd International Conference on Machine Learning', ACM, pp. 233-240. DOI: https://doi.org/10.1145/1143844.1143874

Duvenaud, D., Maclaurin, D., Aguilera-Iparraguirre, J., Gómez-Bombarelli, R., Hirzel, T., Aspuru-Guzik, A. & Adams, R. P. (2015), 'Convolutional Networks on Graphs for Learning Molecular Fingerprints'.

Erdős, P. & Rényi, A. (1960), 'On the evolution of random graphs', Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5, 17-61.

Fawcett, T. (2006), 'An introduction to ROC analysis', Pattern Recognition Letters 27(8), 861-874. DOI: https://doi.org/10.1016/j.patrec.2005.10.010

Gamerman, D. & Lopes, H. F. (2006), Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, 2 edn, Chapman and Hall/CRC. DOI: https://doi.org/10.1201/9781482296426

Goodfellow, I., Bengio, Y. & Courville, A. (2016), Deep Learning, MIT Press. http://www.deeplearningbook.org

Grover, A. & Leskovec, J. (2016), node2vec: Scalable feature learning for networks, in 'Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining', pp. 855-864. DOI: https://doi.org/10.1145/2939672.2939754

Hamilton, W. L., Ying, R. & Leskovec, J. (2017a), Inductive representation learning on large graphs, in 'Advances in Neural Information Processing Systems (NeurIPS)'.

Hamilton, W. L., Ying, R. & Leskovec, J. (2017b), 'Representation learning on graphs: Methods and applications', IEEE Data Engineering Bulletin 40(3), 52-74.

Handcock, M., Hunter, D., Butts, C., Goodreau, S. & Morris, M. (2008), 'statnet: Software tools for the representation, visualization, analysis and simulation of network data', Journal of Statistical Software 24(1), 1-11. DOI: https://doi.org/10.18637/jss.v024.i01

Hoff, P. (2007), 'Modeling homophily and stochastic equivalence in symmetric relational data', Advances in Neural Information Processing Systems 20.

Hoff, P. D., Raftery, A. E. & Handcock, M. S. (2002), 'Latent space approaches to social network analysis', Journal of the American Statistical Association 97(460), 1090-1098. DOI: https://doi.org/10.1198/016214502388618906

Kipf, T. N. & Welling, M. (2017), Semi-supervised classification with graph convolutional networks, in 'Proceedings of the International Conference on Learning Representations (ICLR)'.

Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J. P., Moreno, Y. & Porter, M. A. (2014), 'Multilayer networks', Journal of Complex Networks 2(3), 203-271. DOI: https://doi.org/10.1093/comnet/cnu016

Kolaczyk, E. D. & Csárdi, G. (2020), Statistical Analysis of Network Data with R, Use R!, 2 edn, Springer, Cham. DOI: https://doi.org/10.1007/978-3-030-44129-6

Lee, Y., Lee, I. W. & Feiock, R. C. (2012), 'Interorganizational collaboration networks in economic development policy: An exponential random graph model analysis', Policy Studies Journal 40(3), 547-573. DOI: https://doi.org/10.1111/j.1541-0072.2012.00464.x

Lü, L. & Zhou, T. (2011), 'Link prediction in complex networks: A survey', Physica A: Statistical Mechanics and Its Applications 390(6), 1150-1170. DOI: https://doi.org/10.1016/j.physa.2010.11.027

Luke, D. (2015), A User's Guide to Network Analysis in R, Use R!, Springer International Publishing, Cham. DOI: https://doi.org/10.1007/978-3-319-23883-8

Lusher, D., Koskinen, J. & Robins, G. (2013), Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications, Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511894701

Mikolov, T., Chen, K., Corrado, G. & Dean, J. (2013), Efficient estimation of word representations in vector space, in 'Proceedings of the International Conference on Learning Representations (ICLR)'.

Mikolov, T., Yih, W.-t. & Zweig, G. (2013), 'Linguistic regularities in continuous space word representations', Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies pp. 746-751.

Newman, M. E. J. (2001), 'The structure of scientific collaboration networks', Proceedings of the National Academy of Sciences 98(2), 404-409. DOI: https://doi.org/10.1073/pnas.021544898

Newman, M. E. J., Strogatz, S. H. & Watts, D. J. (2001), 'Random graphs with arbitrary degree distributions and their applications', Physical Review E 64(2), 026118. DOI: https://doi.org/10.1103/PhysRevE.64.026118

Pareja, A., Domeniconi, G., Chen, J., Ma, T., Suzumura, T., Kanezashi, H., Kaler, T., Schardl, T. B. & Leiserson, C. E. (2020), EvolveGCN: Evolving graph convolutional networks for dynamic graphs, in 'Proceedings of the AAAI Conference on Artificial Intelligence'. DOI: https://doi.org/10.1609/aaai.v34i04.5984

Perozzi, B., Al-Rfou, R. & Skiena, S. (2014), DeepWalk: Online learning of social representations, in 'Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining', pp. 701-710. DOI: https://doi.org/10.1145/2623330.2623732

Robins, G., Pattison, P., Kalish, Y. & Lusher, D. (2007), 'An introduction to exponential random graph (p*) models for social networks', Social Networks 29(2), 173-191. DOI: https://doi.org/10.1016/j.socnet.2006.08.002

Rossi, E., Kenlay, H., Gorinova, M., Bronstein, M. & Chamberlain, B. (2020), 'Temporal graph networks for deep learning on dynamic graphs', arXiv preprint arXiv:2006.10637.

Rumelhart, D. E. & McClelland, J. L. (1986), Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations, MIT Press, Cambridge, MA. DOI: https://doi.org/10.7551/mitpress/5236.001.0001

Salter-Townshend, M. & Murphy, T. B. (2013), 'Variational bayesian inference for the latent position cluster model for network data', Computational Statistics & Data Analysis 57(1), 661-671. DOI: https://doi.org/10.1016/j.csda.2012.08.004

Silge, J. & Robinson, D. (2017), Text Mining with R: A Tidy Approach, O'Reilly Media, Inc., Sebastopol, CA. https://www.oreilly.com/library/view/textminingwith/9781491981658/

Skiena, S. S. (2008), The Algorithm Design Manual, Springer London, London. DOI: https://doi.org/10.1007/978-1-84800-070-4

Skiena, S. S. (2017), The Data Science Design Manual, Texts in Computer Science, Springer International Publishing, Cham. DOI: https://doi.org/10.1007/978-3-319-55444-0

Skvoretz, J. (1990), 'Biased net theory: Approximations, simulations and observations', Social Networks 12(3), 217-238. DOI: https://doi.org/10.1016/0378-8733(90)90006-U

Snijders, T. A. B. (2002), 'Markov chain Monte Carlo estimation of exponential random graph models', Journal of Social Structure 3(2), 1-40.

Sokolova, M. & Lapalme, G. (2009), 'A systematic analysis of performance measures for classification tasks', Information Processing & Management 45(4), 427-437. DOI: https://doi.org/10.1016/j.ipm.2009.03.002

Sosa, J. & Buitrago, L. (2021), 'A review of latent space models for social networks', Revista Colombiana de Estadística 44(1), 171-200. DOI: https://doi.org/10.15446/rce.v44n1.89369

Strauss, D. & Ikeda, M. (1990), 'Pseudolikelihood estimation for social networks', Journal of the American Statistical Association 85(409), 204-212. DOI: https://doi.org/10.1080/01621459.1990.10475327

van der Maaten, L. & Hinton, G. (2008), 'Visualizing data using t-SNE', Journal of Machine Learning Research 9(Nov), 2579-2605.

Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C. & Yu, P. S. (2021), 'A comprehensive survey on graph neural networks', IEEE Transactions on Neural Networks and Learning Systems 32(1), 4-24. DOI: https://doi.org/10.1109/TNNLS.2020.2978386

Xu, M. (2021), 'Understanding graph embedding methods and their applications', SIAM Review 63(4), 825-853. DOI: https://doi.org/10.1137/20M1386062

Yang, Z., Algesheimer, R. & Tessone, C. J. (2015), 'Evaluating link prediction methods', Knowledge-Based Systems 74, 87-96.

Yao, L., Mao, C. & Luo, Y. (2019), Graph convolutional networks for text classification, in 'Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence'. DOI: https://doi.org/10.1609/aaai.v33i01.33017370

Ying, Z., Bourgeois, D., You, J., Zitnik, M. & Leskovec, J. (2019), GNNExplainer: Generating explanations for graph neural networks, in 'Advances in Neural Information Processing Systems (NeurIPS)'.

You, Y., Chen, T., Sui, Y., Chen, T., Wang, Z. & Shen, Y. (2020), Graph contrastive learning with augmentations, in 'Advances in Neural Information Processing Systems (NeurIPS)'.

Zhang, Z., Cui, P. & Zhu, W. (2020), 'Deep learning on graphs: A survey', IEEE Transactions on Knowledge and Data Engineering 34(1), 249-270. DOI: https://doi.org/10.1109/TKDE.2020.2981333

How to Cite

APA

Sosa, J., Martínez, D. & Guerrero, N. (2025). A Unified Approach to Link Prediction in Collaboration Networks. Revista Colombiana de Estadística, 48(2), 115–137. https://doi.org/10.15446/rce.v48n2.117558

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).

Revista Colombiana de Estadística
