Predicting the percentage of student placement: A comparative study of machine learning algorithms

Çakıt, Erman; Dağdeviren, Metin

doi:10.1007/s10639-021-10655-4

Published: 02 July 2021

Volume 27, pages 997–1022, (2022)

Education and Information Technologies


Abstract

In recent years, demand for higher education in Turkey has increased and, as in most other countries, now exceeds the available capacity. The main purpose of this research is to develop machine learning models for predicting the percentage of student placement from data on the university's academic reputation, the opportunities offered by the city where the university is located, and the facilities and cultural opportunities of the university. When model accuracy was evaluated using performance metrics, the Extreme Gradient Boosting (XGBoost) algorithm showed greater predictive accuracy than the other machine learning approaches. A sensitivity analysis was then performed with the XGBoost algorithm to identify the degree to which each input variable contributes to the output variable. Five input variables were found to be the most influential: the percentage of student placement at year t-1, the university scientific document score, the university PhD programme score, the university faculty member/student score, and the percentage of student placement at year t-2. The prediction and sensitivity analysis results obtained in this study can be used in many ways, such as determining quotas for universities, allocating resources, and making new regulations.
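The two steps described in the abstract, comparing models on performance metrics and then ranking input variables by sensitivity, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract names neither the metrics nor the sensitivity method, so MAE, RMSE and R² and a permutation-based sensitivity score are used as common stand-ins; the data are synthetic; and `model_a` and `model_b` are hypothetical fixed predictors, not XGBoost or any model from the study.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_sensitivity(predict, X, y, n_features, rng):
    """Increase in RMSE when one input column is shuffled (assumed method)."""
    base = rmse(y, [predict(row) for row in X])
    scores = []
    for j in range(n_features):
        col = [row[j] for row in X]
        rng.shuffle(col)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        scores.append(rmse(y, [predict(row) for row in X_perm]) - base)
    return scores


# Synthetic data: three hypothetical inputs, the third is pure noise.
rng = random.Random(0)
X = [[rng.uniform(0, 100) for _ in range(3)] for _ in range(200)]
y = [0.7 * r[0] + 0.2 * r[1] + rng.gauss(0, 1) for r in X]


def model_a(row):
    # Hypothetical predictor that matches the data-generating weights.
    return 0.7 * row[0] + 0.2 * row[1]


def model_b(row):
    # Hypothetical predictor with deliberately wrong weights.
    return 0.5 * row[0] + 0.3 * row[1]


pred_a = [model_a(r) for r in X]
pred_b = [model_b(r) for r in X]

# Step 1: compare candidate models on the same metrics.
for name, pred in (("model_a", pred_a), ("model_b", pred_b)):
    print(name, round(mae(y, pred), 3), round(rmse(y, pred), 3), round(r2(y, pred), 3))

# Step 2: rank inputs by how much shuffling each one degrades the better model.
scores = permutation_sensitivity(model_a, X, y, 3, rng)
print("sensitivity:", [round(s, 3) for s in scores])
```

In this sketch a lower MAE and RMSE and a higher R² indicate the better model, and a larger sensitivity score marks a more influential input; an input the model ignores scores exactly zero.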

Author information

Authors and Affiliations

Department of Industrial Engineering, Gazi University, 06570, Ankara, Turkey
Erman Çakıt & Metin Dağdeviren

Corresponding author

Correspondence to Erman Çakıt.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Çakıt, E., Dağdeviren, M. Predicting the percentage of student placement: A comparative study of machine learning algorithms. Educ Inf Technol 27, 997–1022 (2022). https://doi.org/10.1007/s10639-021-10655-4

Received: 26 April 2021
Accepted: 24 June 2021
Published: 02 July 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10639-021-10655-4
