Abstract
Academic performance is used worldwide across diverse teaching and learning environments and is regarded as a quantifiable indicator of learning gain. The ability to reliably estimate students' academic performance is important and can help academic staff improve the support they provide. However, performance estimation is non-trivial and is affected by multiple factors, including a student's engagement with learning activities and their social, geographic, and demographic characteristics. This paper investigates the opportunity to develop reliable models for predicting student performance using Artificial Intelligence. Specifically, we propose a two-step academic performance prediction method combining a feature-weighted support vector machine (SVM) and artificial neural network (ANN) learning. A feature-weighted SVM, in which the importance of each feature to the outcome is computed from its information gain ratio, performs coarse-grained binary classification (pass, \(P1\), or fail, \(P0\)). Scores are then divided into detailed levels from D to A+, and ANNs are trained separately on the \(P1\) and \(P0\) classes for fine-grained, multi-class prediction. Experiments and an ablation study on student datasets from two Portuguese secondary schools demonstrate the effectiveness of this hybrid method.
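For readers who want to experiment with the idea, the sketch below outlines the two-step pipeline on the publicly available UCI student-performance data described by Cortez and Silva (2008). The file name student-mat.csv, the mutual-information feature weights (a stand-in for the information gain ratios), the G3-to-grade banding, and the scikit-learn model settings are all assumptions for illustration, not the authors' exact implementation.

```python
# Minimal sketch of the two-step pipeline, assuming the UCI student-performance
# file "student-mat.csv" (';'-separated) from Cortez and Silva (2008).
# The mutual-information weights, the G3-to-grade banding and the scikit-learn
# models are illustrative stand-ins, not the paper's exact configuration.
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("student-mat.csv", sep=";")
X = pd.get_dummies(df.drop(columns=["G3"]), drop_first=True)   # one-hot encode categorical attributes
y_pass = (df["G3"] >= 10).astype(int)                          # step 1 target: pass (P1) vs. fail (P0)
bands = [-1, 4, 9, 11, 13, 15, 17, 20]                         # hypothetical G3-to-grade banding
labels = ["D", "D+", "C", "C+", "B", "A", "A+"]
y_grade = pd.cut(df["G3"], bins=bands, labels=labels).astype(str)  # step 2 target: grade level

# Feature weights: mutual information with the pass/fail label, used here as a
# stand-in for the information-gain-ratio weighting described in the abstract.
weights = mutual_info_classif(X, y_pass, random_state=0)
weights = weights / (weights.max() + 1e-12)

X_tr, X_te, yp_tr, yp_te, yg_tr, yg_te = train_test_split(
    X, y_pass, y_grade, test_size=0.2, stratify=y_pass, random_state=0)

scaler = StandardScaler().fit(X_tr)
Xw_tr = scaler.transform(X_tr) * weights        # feature-weighted inputs
Xw_te = scaler.transform(X_te) * weights

# Step 1: coarse-grained pass/fail classification with an SVM on weighted features.
svm = SVC(kernel="rbf", C=1.0).fit(Xw_tr, yp_tr)
coarse_pred = svm.predict(Xw_te)

# Step 2: one ANN per coarse class, trained only on that class's grade levels.
anns = {}
for c in (0, 1):
    mask = (yp_tr == c).to_numpy()
    anns[c] = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
    anns[c].fit(Xw_tr[mask], yg_tr.to_numpy()[mask])

# Route each test student through the ANN selected by the SVM's coarse decision.
fine_pred = np.array([anns[c].predict(x.reshape(1, -1))[0]
                      for c, x in zip(coarse_pred, Xw_te)])
print("pass/fail accuracy:", (coarse_pred == yp_te.to_numpy()).mean())
print("grade-level accuracy:", (fine_pred == yg_te.to_numpy()).mean())
```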





Funding
This study was funded by the Fundamental Research Funds for the Central Universities (grant number 20720200094).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest or personal relationships that could have appeared to influence the work reported in this paper.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Huang, C., Zhou, J., Chen, J. et al. A feature weighted support vector machine and artificial neural network algorithm for academic course performance prediction. Neural Comput & Applic 35, 11517–11529 (2023). https://doi.org/10.1007/s00521-021-05962-3