Abstract
Student attrition in higher education institutions (HEIs) refers to undergraduate students failing to complete their studies within the stipulated period. Attrition is also a resource problem: students who drop out consume the same resources as students who graduate, yet yield no outcome. Hence, efforts by HEIs to reduce the attrition rate would have a positive impact on productivity. Previous studies highlight numerous factors that contribute to student attrition, and these factors vary from one case to another depending on the case profile. Historical data can therefore provide useful insight into the factors behind student attrition at a given institution. In this paper, we discuss data mining techniques, primarily supervised classification algorithms, for predicting student attrition. We use the Cross-Industry Standard Process for Data Mining (CRISP-DM), comprising five phases, for the case study. Two evaluation methods, cross-validation and percentage split, were used to evaluate the classification methods. The study identified the significant attributes for student attrition prediction: Cumulative Grade Point Average (CGPA), sponsor, family income, disability, and number of dependents. Support Vector Machine with a polynomial kernel appeared to be the best of the five algorithms tested.
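The two evaluation protocols named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the nearest-centroid classifier is a toy stand-in for the algorithms actually compared, and the 66% training fraction, the 10 folds, and all function names are assumptions chosen for illustration only.

```python
import random

def percentage_split(n, train_fraction=0.66, seed=0):
    """Single hold-out split: shuffle record indices once, then cut."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_fraction)
    return idx[:cut], idx[cut:]

def k_fold_splits(n, k=10, seed=0):
    """Yield (train, test) index lists; every record is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def fit_centroids(X, y, train_idx):
    """Toy stand-in classifier: the per-class mean of each feature."""
    sums, counts = {}, {}
    for i in train_idx:
        s = sums.setdefault(y[i], [0.0] * len(X[i]))
        for d, v in enumerate(X[i]):
            s[d] += v
        counts[y[i]] = counts.get(y[i], 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def accuracy(X, y, train_idx, test_idx):
    """Train on train_idx, then report the fraction of test_idx classified correctly."""
    centroids = fit_centroids(X, y, train_idx)
    hits = sum(predict(centroids, X[i]) == y[i] for i in test_idx)
    return hits / len(test_idx)

def make_synthetic(n=200, seed=1):
    """Hypothetical records: [cgpa, income]; label 1 = attrition, 0 = graduated."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.2 if label else 3.2, 0.4),
                  rng.gauss(2.0 if label else 3.0, 1.0)])
        y.append(label)
    return X, y

X, y = make_synthetic()

# Percentage split: one train/test partition, one accuracy figure.
train, test = percentage_split(len(X))
split_acc = accuracy(X, y, train, test)

# Cross-validation: k partitions, accuracy averaged over the folds.
cv_scores = [accuracy(X, y, tr, te) for tr, te in k_fold_splits(len(X), k=10)]
cv_acc = sum(cv_scores) / len(cv_scores)

print(f"percentage split accuracy: {split_acc:.3f}")
print(f"10-fold cross-validation accuracy: {cv_acc:.3f}")
```

The percentage split gives a single estimate that depends on one random partition, whereas cross-validation averages over several partitions so that every record is used for testing once. Reporting both, as the abstract describes, gives a rough check on how sensitive the ranking of classifiers is to the evaluation protocol.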
Acknowledgement
The authors would like to thank the Research Management Centre and the Center of Strategic Planning and Information (CSPI) of Universiti Teknologi MARA for supporting the research.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ahmad Tarmizi, S.S., Mutalib, S., Abdul Hamid, N.H., Abdul-Rahman, S., Md Ab Malik, A. (2019). A Case Study on Student Attrition Prediction in Higher Education Using Data Mining Techniques. In: Berry, M., Yap, B., Mohamed, A., Köppen, M. (eds) Soft Computing in Data Science. SCDS 2019. Communications in Computer and Information Science, vol 1100. Springer, Singapore. https://doi.org/10.1007/978-981-15-0399-3_15
DOI: https://doi.org/10.1007/978-981-15-0399-3_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0398-6
Online ISBN: 978-981-15-0399-3
eBook Packages: Computer Science (R0)