Published

2020-01-01

Modeling of apartment prices in a Colombian context from a machine learning approach with stable-important attributes

Modelización de precios de apartamentos en un contexto colombiano desde un enfoque machine learning con atributos estables-importantes

DOI:

https://doi.org/10.15446/dyna.v87n212.80202

Keywords:

machine learning, real estate, property prices, big data (en)
aprendizaje de máquinas, bienes raíces, precios inmobiliarios, datos masivos (es)

Authors

Jorge Iván Pérez Rave, Favián González Echavarría, Juan Carlos Correa Morales

Abstract

The objective of this work is to develop a machine learning model for online pricing of apartments in a Colombian context. The article addresses three aspects: i) it compares the predictive capacity of linear regression, regression trees, random forest and bagging; ii) it studies the effect of a group of text-derived attributes on the predictive capacity of the models; and iii) it identifies the most stable-important attributes and interprets them from an inferential perspective to better understand the object of study. The sample consists of 15,177 real estate observations. The ensemble methods (random forest and bagging) show predictive superiority over the others. The attributes derived from the text had a significant relationship with the property price (on a log scale). However, their contribution to the predictive capacity was almost nil, since four other attributes achieved highly accurate predictions and remained stable when the sample changed.
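The idea of a "stable-important" attribute can be illustrated with a minimal sketch. This is not the authors' procedure: the data are synthetic, every attribute name and coefficient is invented for illustration, and absolute Pearson correlation with log price stands in for the tree-based importance measures that a random forest or bagging model would provide. The sketch redraws the sample with replacement many times and records how often each attribute ranks among the top k.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def make_listings(n, rng):
    """Synthetic listings. Attribute names and coefficients are hypothetical."""
    rows = []
    for _ in range(n):
        area = rng.uniform(40, 200)
        bathrooms = rng.randint(1, 4)
        has_view_word = rng.randint(0, 1)  # hypothetical text-derived flag
        noise_attr = rng.gauss(0, 1)       # attribute unrelated to price
        log_price = (12 + 0.010 * area + 0.15 * bathrooms
                     + 0.02 * has_view_word + rng.gauss(0, 0.2))
        rows.append({
            "area": area,
            "bathrooms": bathrooms,
            "has_view_word": has_view_word,
            "noise_attr": noise_attr,
            "log_price": log_price,
        })
    return rows


def top_k_frequency(rows, attributes, k, n_resamples, rng):
    """Share of bootstrap resamples in which each attribute ranks in the top k."""
    counts = {a: 0 for a in attributes}
    for _ in range(n_resamples):
        # Bootstrap: redraw the sample with replacement.
        sample = [rng.choice(rows) for _ in rows]
        y = [r["log_price"] for r in sample]
        # Importance proxy: absolute correlation with log price.
        scores = {a: abs(pearson([r[a] for r in sample], y)) for a in attributes}
        ranked = sorted(attributes, key=lambda a: scores[a], reverse=True)
        for a in ranked[:k]:
            counts[a] += 1
    return {a: counts[a] / n_resamples for a in attributes}


rng = random.Random(0)
rows = make_listings(500, rng)
attributes = ["area", "bathrooms", "has_view_word", "noise_attr"]
freq = top_k_frequency(rows, attributes, k=2, n_resamples=50, rng=rng)
for name in attributes:
    print(f"{name}: top-2 in {freq[name]:.0%} of resamples")
```

An attribute that stays near the top across resamples is both important and stable in the sense used above. An attribute can also be statistically related to price, like the weak text flag here, and still add little to prediction once stronger attributes are present.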



How to cite

IEEE

[1]
J. I. Pérez Rave, F. González Echavarría, and J. C. Correa Morales, "Modeling of apartment prices in a Colombian context from a machine learning approach with stable-important attributes", DYNA, vol. 87, no. 212, pp. 63–72, Jan. 2020.


CrossRef Cited-by

CrossRef citations: 4

1. Jack Cook, Sabrina Rotenberg. (2026). AI innovations, applications and recommendations in property management. Property Management, p.1. https://doi.org/10.1108/PM-01-2025-0006.

2. Cihan ÇILGIN, Yılmaz GÖKŞEN, Hadi GÖKÇEN. (2023). The Effect of Outlier Detection Methods in Real Estate Valuation with Machine Learning. İzmir Sosyal Bilimler Dergisi, 5(1), p.9. https://doi.org/10.47899/ijss.1270433.

3. Krit Jaroensittichai, Kebfan Jittrong, Tosporn Arreeras, Ekkarat Singkhala, Pattaramon Vuttipittayamongkol. (2024). Predictive Modeling of Off-Campus Student Housing Rentals Using Machine Learning. 2024 International Conference on Decision Aid Sciences and Applications (DASA), p.1. https://doi.org/10.1109/DASA63652.2024.10836318.

4. Quoc Anh Tran, Lanh Si Ho, Hiep Van Le, Indra Prakash, Binh Thai Pham. (2022). Estimation of the undrained shear strength of sensitive clays using optimized inference intelligence system. Neural Computing and Applications, 34(10), p.7835. https://doi.org/10.1007/s00521-022-06891-5.


Article abstract page views

1087
