Forecasting Students’ Performance Using an Ensemble SSL Algorithm

  • Conference paper
Technology and Innovation in Learning, Teaching and Education (TECH-EDU 2018)

Abstract

Educational data mining is a growing research area that aims to gain insight into student behavior, interactions, and performance by applying data mining methods to educational data. Over the last decades, a variety of accurate models have been developed to monitor students’ future progress, most of them based on supervised classification methods. In this work, we propose an ensemble semi-supervised algorithm for predicting students’ performance in the final examinations at the end of the academic year. The experimental results demonstrate the efficiency and robustness of the proposed algorithm, in terms of accuracy, compared with several classical classification algorithms.

Acknowledgments

The authors are grateful to the private high school “Avgoulea-Linardatou” for the collection of the data used in our study.

Author information

Corresponding author

Correspondence to Ioannis E. Livieris.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Livieris, I.E., Tampakas, V., Kiriakidou, N., Mikropoulos, T., Pintelas, P. (2019). Forecasting Students’ Performance Using an Ensemble SSL Algorithm. In: Tsitouridou, M., A. Diniz, J., Mikropoulos, T. (eds) Technology and Innovation in Learning, Teaching and Education. TECH-EDU 2018. Communications in Computer and Information Science, vol 993. Springer, Cham. https://doi.org/10.1007/978-3-030-20954-4_43

  • DOI: https://doi.org/10.1007/978-3-030-20954-4_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20953-7

  • Online ISBN: 978-3-030-20954-4

  • eBook Packages: Computer Science, Computer Science (R0)
