Abstract
Predicting students’ academic performance is one of the oldest and most popular applications of educational data mining. It estimates a student’s result before the actual evaluation is known. However, large volumes of data in different formats and from multiple sources may contain many features presumed irrelevant, which can distort the prediction results. The main objective of this paper is to improve the effectiveness of a predictive model for students’ academic performance. To this end, we propose a methodology for a comparative study that evaluates the influence of feature selection techniques on the prediction of students’ academic performance. The F-measure is used to evaluate the effectiveness of the selected techniques. Two real data sources are used in this work: Mathematics and language courses. The outcomes are compared and discussed in order to identify the technique that contributes most to an accurate predictive model.
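As an illustration of the evaluation criterion named in the abstract, the following minimal sketch computes the F-measure for one class from true and predicted labels. It is not the authors' implementation: the function name `f_measure`, the choice of "fail" as the class of interest, the default `beta = 1`, and the toy label lists are all assumptions made for this example.

```python
def f_measure(y_true, y_pred, positive, beta=1.0):
    """F-measure for one class: weighted harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0  # no true positives: define the score as 0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels (not data from the paper): the same classifier
# trained once on all attributes and once on a selected subset.
y_true = ["pass", "pass", "fail", "fail", "pass", "fail"]
pred_all_features = ["pass", "fail", "fail", "pass", "pass", "pass"]
pred_selected = ["pass", "pass", "fail", "fail", "pass", "pass"]

score_all = f_measure(y_true, pred_all_features, positive="fail")
score_selected = f_measure(y_true, pred_selected, positive="fail")
print(f"all features: {score_all:.2f}, selected features: {score_selected:.2f}")
```

Under this sketch, a feature selection technique would be judged helpful when the score on the selected subset exceeds the score on the full attribute set for the same classifier and the same test data.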
Notes
- 1.
- 2. European exchange programme that enables student exchange in 31 countries.
Acknowledgment
The authors thank the Erasmus+ project for funding the research reported here under Grant Agreement number 2015-1-ES01-K107-015469.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Abid, A., Kallel, I., Blanco, I.J., Benayed, M. (2018). Selecting Relevant Educational Attributes for Predicting Students’ Academic Performance. In: Abraham, A., Muhuri, P., Muda, A., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2017. Advances in Intelligent Systems and Computing, vol 736. Springer, Cham. https://doi.org/10.1007/978-3-319-76348-4_63
DOI: https://doi.org/10.1007/978-3-319-76348-4_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76347-7
Online ISBN: 978-3-319-76348-4
eBook Packages: Engineering, Engineering (R0)