Abstract
This paper aims to predict, on a purely algorithmic basis, which students are at risk of dropping out of university. The data come from the University of Bari Aldo Moro, cover the period 2013–16, and were provided by the Osservatorio Studenti-Didattica of Miur-Cineca. The analysis relies solely on the information available for each student in the university information system. Individual dropouts are predicted with supervised classification algorithms, a family of Machine Learning techniques.
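To make the idea of supervised classification concrete, the following is a minimal sketch only, not the authors' pipeline. The abstract does not say which algorithms were used, so logistic regression is an assumption here; the two features (share of credits earned, rescaled mean grade), the synthetic data generator, and all function names are hypothetical and unrelated to the University of Bari data.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_students(n, seed=0):
    """Generate hypothetical (features, label) pairs; label 1 means dropout."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        credits = rng.uniform(0.0, 1.0)  # share of first-year credits earned (hypothetical)
        grade = rng.uniform(0.0, 1.0)    # rescaled mean exam grade (hypothetical)
        # Assumed ground truth: few credits and low grades raise dropout risk.
        p = sigmoid(4.0 - 6.0 * credits - 3.0 * grade)
        label = 1 if rng.random() < p else 0
        data.append(((credits, grade), label))
    return data


def fit_logistic(data, lr=0.5, epochs=300):
    """Fit two weights and a bias by batch gradient descent on the log-loss."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        gw = [0.0, 0.0]
        gb = 0.0
        for x, y in data:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w[0] -= lr * gw[0] / n
        w[1] -= lr * gw[1] / n
        b -= lr * gb / n
    return w, b


def predict_risk(model, x):
    """Return the estimated dropout probability for one feature vector."""
    w, b = model
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


# Train on labeled students, then evaluate on students the model has not seen.
students = make_synthetic_students(600, seed=42)
train, test = students[:400], students[400:]
model = fit_logistic(train)
accuracy = sum((predict_risk(model, x) >= 0.5) == (y == 1) for x, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The split into a training set and a held-out set reflects the usual supervised workflow: the classifier is fitted on students whose outcome is known and judged on students it has not seen. With real administrative data, the features would be whatever the university information system records for each student.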
This research has been carried out as part of research program no. 13/08 of the Ionic Department of the University of Bari Aldo Moro, entitled “Analysis of Business Intelligence aimed at intercepting abandonment by university students enrolled in the Ionic Department”, included in the program agreement with the Municipality of Taranto for the 2011/2013 period, to contribute to the consolidation of the Ionic University Facilities. We would like to thank the Rector of the University of Bari for authorizing the consultation, in anonymous form and for scientific research purposes, of the data on the UniBA student population held by the MIUR-Cineca Student-Didactic Observatory. Finally, we wish to thank Dr. Massimo Iaquinta, head of the U.O. Statistiche di Ateneo, Settore Servizi Istituzionali, for his support in retrieving the data. The authors contributed equally to the writing of this paper.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Serra, A., Perchinunno, P., Bilancia, M. (2018). Predicting Student Dropouts in Higher Education Using Supervised Classification Algorithms. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2018. ICCSA 2018. Lecture Notes in Computer Science, vol 10962. Springer, Cham. https://doi.org/10.1007/978-3-319-95168-3_2
DOI: https://doi.org/10.1007/978-3-319-95168-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95167-6
Online ISBN: 978-3-319-95168-3
eBook Packages: Computer Science (R0)