An Actionable Learning Path-based Model to Predict and Describe Academic Dropout
Un modelo accionable basado en el camino de aprendizaje para predecir y describir la deserción académica
DOI: https://doi.org/10.15446/ing.investig.109389
Keywords: academic trajectory, student model, dropout, explainability, curriculum analysis
Predicting and explaining student dropout in degree programs is an important issue, as dropout affects students, families, and institutions. Nevertheless, most efforts in this regard have focused on predictive power, even though explainability is more relevant to decision-makers. The objectives of this work were to propose a novel explainability model to predict dropout, to analyze its descriptive power in explaining key configurations in academic trajectories, and to compare the model against other well-known approaches in the literature, including an analysis of the key factors in student dropout. To this end, academic data from a Computer Science Engineering program were used with three models: (i) a traditional model based on overall indicators of student performance, (ii) a normalized model with overall indicators separated by semester, and (iii) a novel configuration model, which considered the students' performance in specific sets of courses. The results showed that the configuration model, despite not being the most powerful, could provide accurate early predictions, as well as actionable information through the discovery of critical configurations, which program directors could consider when counseling students and designing curricula. Furthermore, the average grade and the rate of passed courses were found to be the most relevant variables in the literature-reported models, and they could also characterize configurations. Overall, the new method can be useful for making predictions, and it can provide new insights for analyzing curricula and for making better counseling and innovation decisions.
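To make the difference between the three feature views in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The record layout, the passing threshold `PASS_GRADE`, the course codes, and all function names are assumptions introduced here. In particular, encoding a "configuration" as "the student failed every course in a given set at least once" is only one possible reading of "performance in specific sets of courses".

```python
from collections import defaultdict
from statistics import mean

# Assumed passing threshold; the real value depends on the grading scale.
PASS_GRADE = 4.0

# Hypothetical enrollment records: (student, semester, course, grade).
RECORDS = [
    ("s1", 1, "CALC1", 3.0),
    ("s1", 1, "PROG1", 5.5),
    ("s1", 2, "CALC1", 4.5),
    ("s2", 1, "CALC1", 6.0),
    ("s2", 1, "PROG1", 6.5),
]


def pass_rate(grades):
    """Fraction of grades at or above the assumed passing threshold."""
    return sum(g >= PASS_GRADE for g in grades) / len(grades)


def overall_features(records):
    """(i) One average grade and one pass rate per student."""
    by_student = defaultdict(list)
    for student, _sem, _course, grade in records:
        by_student[student].append(grade)
    return {
        s: {"avg_grade": mean(g), "pass_rate": pass_rate(g)}
        for s, g in by_student.items()
    }


def semester_features(records):
    """(ii) The same indicators, computed separately for each semester."""
    by_key = defaultdict(list)
    for student, sem, _course, grade in records:
        by_key[(student, sem)].append(grade)
    feats = defaultdict(dict)
    for (s, sem), g in by_key.items():
        feats[s][f"avg_grade_sem{sem}"] = mean(g)
        feats[s][f"pass_rate_sem{sem}"] = pass_rate(g)
    return dict(feats)


def configuration_features(records, course_sets):
    """(iii) Assumed encoding: True if the student failed every course
    in the named set at least once."""
    failed = defaultdict(set)
    for student, _sem, course, grade in records:
        if grade < PASS_GRADE:
            failed[student].add(course)
    students = {r[0] for r in records}
    return {
        s: {name: set(courses) <= failed[s] for name, courses in course_sets.items()}
        for s in students
    }


if __name__ == "__main__":
    # Hypothetical course sets to test as candidate configurations.
    sets_to_check = {
        "failed_calc1": {"CALC1"},
        "failed_calc1_and_prog1": {"CALC1", "PROG1"},
    }
    print(overall_features(RECORDS))
    print(semester_features(RECORDS))
    print(configuration_features(RECORDS, sets_to_check))
```

In such a sketch, views (i) and (ii) summarize performance with aggregate indicators, whereas view (iii) yields one interpretable flag per course set, which is the kind of output that could point a program director to a specific combination of courses.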
Funding data
- Erasmus+, grant number 586120-EPP-1-2017-1-ES-EPPKA2-CBHE-JP
- Ministerio de Ciencia, Innovación y Universidades, grant number TIN2017-85179-C3-1-R
- Ministerio de Ciencia, Innovación y Universidades, grant number FPU016/00526
License
Copyright (c) 2024 Cristian Olivares-Rodríguez, Pedro Manuel Moreno-Marcos, Eliana Scheihing Garcia, Pedro J. Muñoz-Merino, Carlos Delgado-Kloos

This work is licensed under a Creative Commons Attribution 4.0 International License.