Abstract
School dropout and academic underachievement have significant effects on economic growth and employment in society. They affect not only students’ intellectual development but also their access to desirable job opportunities, which can improve their quality of life. This paper focuses on predicting school dropout using the Predict Students’ Dropout and Academic Success dataset. To address the class imbalance problem, we use three resampling techniques: Adaptive Synthetic (ADASYN), Support Vector Machine-Synthetic Minority Oversampling Technique (SVM-SMOTE), and Synthetic Minority Oversampling Technique+Edited Nearest Neighbor (SMOTE+ENN). We also compare the performance of the Random Forest, Support Vector Machine, and XGBoost classifiers under the default hyperparameter configuration and under a configuration tuned with Bayesian optimization. The SMOTE+ENN technique obtained the best average performance with the Support Vector Machine classifier, achieving an accuracy of 93.55% and a precision of 94.11%.
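The abstract names SMOTE-based resampling without describing how synthetic samples are produced. As a minimal sketch, assuming numeric features and Euclidean distance, the interpolation step at the heart of SMOTE could look like the following. The function name `smote_sketch`, its parameters `k` and `seed`, and the toy data are illustrative assumptions, not taken from the paper, and the Edited Nearest Neighbor cleaning step of SMOTE+ENN is not included.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE only;
    the ENN cleaning step is omitted). Hypothetical helper for illustration."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two numeric features (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
```

Each synthetic point lies on the line segment between two existing minority samples, so the minority class grows without duplicating records exactly. In practice a library implementation would normally be used instead of a hand-written one.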
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cuevas-Chávez, P.A., Narciso, S., Sánchez-Jiménez, E., Pérez, I.C., Hernández, Y., Ortiz-Hernandez, J. (2024). School Dropout Prediction with Class Balancing and Hyperparameter Configuration. In: Calvo, H., Martínez-Villaseñor, L., Ponce, H., Zatarain Cabada, R., Montes Rivera, M., Mezura-Montes, E. (eds) Advances in Computational Intelligence. MICAI 2023 International Workshops. MICAI 2023. Lecture Notes in Computer Science, vol 14502. Springer, Cham. https://doi.org/10.1007/978-3-031-51940-6_2
DOI: https://doi.org/10.1007/978-3-031-51940-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51939-0
Online ISBN: 978-3-031-51940-6
eBook Packages: Computer Science, Computer Science (R0)