Abstract
Educational data mining is a growing research area that aims to gain insight into student behavior, interactions and performance by applying data mining methods to educational data. Over the last decades, a variety of accurate models have been developed to monitor students’ future progress, most of them based on supervised classification methods. In this work, we propose an ensemble semi-supervised algorithm for predicting students’ performance in the final examinations at the end of the academic year. The experimental results demonstrate the efficiency and robustness of the proposed algorithm, in terms of accuracy, compared to some classical classification algorithms.
Acknowledgments
The authors are grateful to the private high school “Avgoulea-Linardatou” for the collection of the data used in our study.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Livieris, I.E., Tampakas, V., Kiriakidou, N., Mikropoulos, T., Pintelas, P. (2019). Forecasting Students’ Performance Using an Ensemble SSL Algorithm. In: Tsitouridou, M., A. Diniz, J., Mikropoulos, T. (eds) Technology and Innovation in Learning, Teaching and Education. TECH-EDU 2018. Communications in Computer and Information Science, vol 993. Springer, Cham. https://doi.org/10.1007/978-3-030-20954-4_43
DOI: https://doi.org/10.1007/978-3-030-20954-4_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20953-7
Online ISBN: 978-3-030-20954-4
eBook Packages: Computer Science, Computer Science (R0)