Predicting Student Dropouts in Higher Education Using Supervised Classification Algorithms

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10962)

Abstract

The aim of this paper is to predict, on a purely algorithmic basis, students who are at risk of dropping out of university. Data used in this study originated from the University of Bari Aldo Moro, during 2013–16, and were provided by the Osservatorio Studenti-Didattica of Miur-Cineca. Data analysis is based solely on the information set available, for each student, inside the university information system. Predictions of individual dropouts have been carried out by means of suitable Machine Learning techniques, known as supervised classification algorithms.
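As an illustration of what a supervised classification algorithm does in this setting, the following minimal sketch trains a logistic regression classifier on labelled student records and evaluates it on held-out records. Everything specific in it is an assumption made for the example and is not taken from the paper: the two features (first-year credits and mean exam grade), the synthetic data-generating rule, the choice of logistic regression, and all function names. It does not reproduce the authors' data, features, or models.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_students(n, rng):
    """Hypothetical records: ([credits, grade] scaled to [0, 1], dropout label)."""
    rows = []
    for _ in range(n):
        credits = rng.uniform(0.0, 60.0)   # assumed: credits earned in year one
        grade = rng.uniform(18.0, 30.0)    # assumed: mean exam grade
        # Assumed generating rule: fewer credits and lower grades raise the risk.
        risk = sigmoid(3.0 - 0.12 * credits - 0.05 * (grade - 18.0))
        label = 1 if rng.random() < risk else 0   # 1 = dropout
        rows.append(([credits / 60.0, (grade - 18.0) / 12.0], label))
    return rows


def train_logistic(rows, epochs=1000, lr=1.0):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w, b, n = [0.0, 0.0], 0.0, len(rows)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in rows:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def evaluate(rows, w, b, threshold=0.5):
    """Accuracy, sensitivity and specificity from the confusion matrix."""
    tp = tn = fp = fn = 0
    for x, y in rows:
        pred = 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 0 and y == 0:
            tn += 1
        elif pred == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / len(rows),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def run_demo(seed=0):
    """Train on 600 synthetic students, test on 400 unseen ones."""
    rng = random.Random(seed)
    rows = make_synthetic_students(1000, rng)
    train, test = rows[:600], rows[600:]
    w, b = train_logistic(train)
    metrics = evaluate(test, w, b)
    metrics["weights"] = w
    return metrics


if __name__ == "__main__":
    print(run_demo())
```

Evaluating on records the model has not seen during training is what makes the reported accuracy, sensitivity and specificity an estimate of predictive performance rather than of fit.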

This research has been carried out as part of the activities of research program no. 13/08 of the Ionic Department of the University of Bari Aldo Moro, entitled “Analysis of Business Intelligence aimed at intercepting abandonment by university students enrolled in the Ionic Department”, included in the program agreement with the Municipality of Taranto for the 2011/2013 period, to contribute to the consolidation of the Ionic University Facilities. We would like to thank the Rector of the University of Bari for authorizing the consultation, in anonymous form and for scientific research purposes, of the data relating to the UniBA student population of the MIUR-Cineca Student-Didactic Observatory. Finally, we wish to thank Dr. Massimo Iaquinta, who is responsible for the U.O. Statistiche di Ateneo, Settore Servizi Istituzionali, for his support in retrieving the data. The authors contributed equally to the writing of this paper.

Corresponding author

Correspondence to Massimo Bilancia.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Serra, A., Perchinunno, P., Bilancia, M. (2018). Predicting Student Dropouts in Higher Education Using Supervised Classification Algorithms. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2018. ICCSA 2018. Lecture Notes in Computer Science, vol 10962. Springer, Cham. https://doi.org/10.1007/978-3-319-95168-3_2

  • DOI: https://doi.org/10.1007/978-3-319-95168-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-95167-6

  • Online ISBN: 978-3-319-95168-3

  • eBook Packages: Computer Science, Computer Science (R0)
