University admission process: a prescriptive analytics approach

Kiaghadi, Mohammadreza; Hoseinpour, Pooya

doi:10.1007/s10462-022-10171-y

Published: 19 August 2022

Artificial Intelligence Review, Volume 56, pages 233–256 (2023)

Mohammadreza Kiaghadi¹ &
Pooya Hoseinpour¹

Abstract

Students typically lack practical tools to help them choose which universities to apply to. This work proposes a comprehensive analytics framework that serves as a decision support tool for students in the admission process. As an essential element of the framework, a prediction procedure is developed to accurately estimate the student's chance of admission to each university using various machine learning methods. Random forest combined with kernel principal component analysis is found to outperform the other prediction models. In addition, an online survey is built to elicit the student's utility for each university. A mathematical programming model is then proposed to determine the best universities to apply to among the candidates, subject to the likely limitations, the most important of which is the student's budget. The model is also extended to consider multiple objectives. Finally, a case study is provided to show the practicality of the developed decision support tool.

References

Abbas AE (2010) Constructing multiattribute utility functions for decision analysis. In: Risk and optimization in an uncertain world. INFORMS, pp 62–98
Achabal DD, McIntyre SH, Smith SA, Kalyanam K (2000) A decision support system for vendor managed inventory. J Retail 76(4):430–454
Acharya MS, Armaan A, Antony AS (2019) A comparison of regression models for prediction of graduate admissions. In: 2019 International conference on computational intelligence in data science (ICCIDS). IEEE, pp 1–5
Adekitan AI, Noma-Osaghae E (2019) Data mining approach to predicting the performance of first year student in a university using the admission requirements. Educ Inf Technol 24(2):1527–1543
Albawi S, Mohammed TA, Al-Zawi S (2017) Understanding of a convolutional neural network. In: 2017 International conference on engineering and technology. IEEE, pp 1–6
Asif R, Merceron A, Ali SA, Haider NG (2017) Analyzing undergraduate students’ performance using educational data mining. Comput Educ 113:177–194
Audet C, Hare W (2017) Biobjective optimization. In: Derivative-free and blackbox optimization. Springer, New York, pp 247–262
Baucells M, Sarin RK (2003) Group decisions with multiple criteria. Manage Sci 49(8):1105–1118
Belloni A, Lovett MJ, Boulding W, Staelin R (2012) Optimal admission and scholarship decisions: choosing customized marketing offers to attract a desirable mix of customers. Mark Sci 31(4):621–636
Board S (2009) Preferences and utility. UCLA, Los Angeles
Chui KT, Fung DCL, Lytras MD, Lam TM (2020) Predicting at-risk university students in a virtual learning environment via a machine learning algorithm. Comput Hum Behav 107:105584
Ding L (2019) Theoretical perspectives of quantitative physics education research. Phys Rev Phys Educ Res 15(2):020101
Dumitrescu E, Hue S, Hurlin C, Tokpavi S (2022) Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects. Eur J Oper Res 297(3):1178–1192
Egorow O, Siegert I, Wendemuth A (2018) Improving emotion recognition performance by random-forest-based feature selection. In: International conference on speech and computer. Springer, Berlin, pp 134–144
Esteban A, Zafra A, Romero C (2020) Helping university students to choose elective courses by using a hybrid multi-criteria recommendation system with genetic optimization. Knowl Based Syst 194:105385
Ghai B (2015) Analysis & prediction of American graduate admissions process. Stony Brook University, Department of Computer Science
Gharroudi O, Elghazel H, Aussem A (2014) A comparison of multi-label feature selection methods using the random forest paradigm. In: Canadian conference on artificial intelligence. Springer, New York, pp 95–106
Ghodsypour SH, O’Brien C (1998) A decision support system for supplier selection using an integrated analytic hierarchy process and linear programming. Int J Prod Econ 56:199–212
Gray CC, Perkins D (2019) Utilizing early engagement and machine learning to predict student outcomes. Comput Educ 131:22–32
Gupta N, Sawhney A, Roth D (2016) Will I get in? Modeling the graduate admission process for American universities. In: 2016 IEEE 16th international conference on data mining workshops (ICDMW). IEEE, pp 631–638
Helal S, Li J, Liu L, Ebrahimie E, Dawson S, Murray DJ, Long Q (2018) Predicting academic performance by considering student heterogeneity. Knowl Based Syst 161:134–146
Hoffait A-S, Schyns M (2017) Early detection of university students with potential difficulties. Decis Support Syst 101:1–11
Hussain M, Zhu W, Zhang W, Abidi SMR, Ali S (2019) Using machine learning to predict student difficulties from learning session data. Artif Intell Rev 52(1):381–407
Injadat M, Moubayed A, Nassif AB, Shami A (2020) Systematic ensemble model selection approach for educational data mining. Knowl Based Syst 200:105992
Jansen SJ (2011) The multi-attribute utility method. In: Jansen SJT et al (eds) The measurement and analysis of housing preference and choice. Springer, New York, pp 101–125
Kaur P, Gosain A (2020) Robust hybrid data-level sampling approach to handle imbalanced data during classification. Soft Comput 24(20):15715–15732
Kim D, Kim N, Cho J, Shin H (2019) Optimizing the multistage university admission decision process. INFORMS J Appl Anal 49(6):422–429
Kotsiantis SB (2012) Use of machine learning techniques for educational proposes: a decision support system for forecasting students’ grades. Artif Intell Rev 37(4):331–344
Kutner MH, Nachtsheim CJ, Neter J, Li W (2005) Applied linear statistical models, 5th edn. McGraw-Hill/Irwin, New York
Li S, Harner EJ, Adjeroh DA (2011) Random KNN feature selection-a fast and stable alternative to Random Forests. BMC Bioinformatics 12(1):1–11
Lykourentzou I, Giannoukos I, Nikolopoulos V, Mpardis G, Loumos V (2009) Dropout prediction in e-learning courses through the combination of machine learning techniques. Comput Educ 53(3):950–965
Maldonado S, Armelini G, Guevara CA (2017) Assessing university enrollment and admission efforts via hierarchical classification and feature selection. Intell Data Anal 21(4):945–962
Maltz EN, Murphy KE, Hand ML (2007) Decision support for university enrollment management: Implementation and experience. Decis Support Syst 44(1):106–123
Mansmann S, Scholl MH (2007) Decision support system for managing educational capacity utilization. IEEE Trans Educ 50(2):143–150
Mengash HA (2020) Using data mining techniques to predict student performance to support decision making in university admission systems. IEEE Access 8:55462–55470
Moore JS (1998) An expert system approach to graduate school admission decisions and academic performance prediction. Omega 26(5):659–670
Moxnes E (2004) Estimating customer utility of energy efficiency standards for refrigerators. J Econ Psychol 25(6):707–724
Ngai EW, Wat F (2005) Fuzzy decision support system for risk analysis in e-commerce development. Decis Support Syst 40(2):235–255
Nissen J, Donatello R, Van Dusen B (2019) Missing data and bias in physics education research: a case for using multiple imputation. Phys Rev Phys Educ Res 15(2):020106
Partridge M, Calvo RA (1998) Fast dimensionality reduction and simple PCA. Intell Data Anal 2(3):203–214
Picard RR, Cook RD (1984) Cross-validation of regression models. J Am Stat Assoc 79(387):575–583
Probst P, Wright MN, Boulesteix AL (2019) Hyperparameters and tuning strategies for random forest. Data Min Knowl Discov 9(3):e1301
Ragab AHM, Mashat AFS, Khedra AM (2012) HRSPCA: hybrid recommender system for predicting college admission. In: 2012 12th International conference on intelligent systems design and applications (ISDA). IEEE, pp 107–113
Raschka S (2018) Model evaluation, model selection, and algorithm selection in machine learning. https://arxiv.org/abs/1811.12808
Rodriguez JD, Perez A, Lozano JA (2009) Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans Pattern Anal Mach Intell 32(3):569–575
Rutkowski L, Jaworski M, Pietruczuk L, Duda P (2014) The CART decision tree for mining data streams. Inf Sci 266:1–15
Speiser JL, Miller ME, Tooze J, Ip E (2019) A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst Appl 134:93–101
Springuel RP, Wittmann MC, Thompson JR (2019) Reconsidering the encoding of data in physics education research. Phys Rev Phys Educ Res 15(2):020103
Stone M (1978) Cross-validation: a review. Statistics 9(1):127–139
Van Dusen B, Nissen J (2019) Modernizing use of regression models in physics education research: a review of hierarchical linear modeling. Phys Rev Phys Educ Res 15(2):020108
Walczak S, Sincich T (1999) A comparative analysis of regression and neural networks for university admissions. Inf Sci 119(1–2):1–20
Waters A, Miikkulainen R (2014) Grade: machine learning support for graduate admissions. AI Mag 35(1):64–64
Wu H, Lin A, Xing X, Song D, Li Y (2021) Identifying core driving factors of urban land use change from global land cover products and POI data using the random forest method. Int J Appl Earth Observ Geoinform 103:102475
Young NT, Caballero MD (2019) Using machine learning to understand physics graduate school admissions. https://arxiv.org/abs/1907.01570

Author information

Authors and Affiliations

Department of Industrial Engineering & Management Systems, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
Mohammadreza Kiaghadi & Pooya Hoseinpour

Corresponding author

Correspondence to Pooya Hoseinpour.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Details on preprocessing

Section 4.1 established that the dataset used in this framework is clean and has no missing values. If a dataset does contain missing data, there are several ways to handle it: deleting the records with missing values, replacing each missing value with the mean of its column, predicting the missing values, and so on. To check the impact of missing data on the prediction models, a new dataset is created from the original one by intentionally removing 10% of the data. The first two methods above, i.e., mean replacement and deletion, are then used to handle the missing values. The R-square values of the prediction models on the new dataset are reported in Table 11.

Table 11 Comparison of prediction methods on new dataset, including missing values
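The missing-data experiment described above can be sketched in a few lines. This is a minimal illustration only, not the authors' code: the function names, the toy data, and the choice to hide individual cells uniformly at random are all assumptions, since the appendix does not say how the 10% of values were selected.

```python
import random

def mask_at_random(rows, fraction, seed=0):
    """Return a copy of rows with `fraction` of all cells replaced by None.

    Hypothetical helper: cells are hidden uniformly at random.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    hidden = set(rng.sample(cells, round(fraction * len(cells))))
    return [[None if (i, j) in hidden else value for j, value in enumerate(row)]
            for i, row in enumerate(rows)]

def drop_incomplete(rows):
    """Deletion method: keep only rows that have no missing cell."""
    return [row for row in rows if None not in row]

def impute_column_means(rows):
    """Mean replacement: fill each missing cell with its column's observed mean.

    Assumes every column keeps at least one observed value.
    """
    means = []
    for j in range(len(rows[0])):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if value is None else value for j, value in enumerate(row)]
            for row in rows]

def r_squared(y_true, y_pred):
    """Coefficient of determination, the metric reported in Table 11."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy stand-in for the admission dataset (made-up numbers, 20 rows x 3 columns).
toy = [[float(i), float(i % 5), 0.1 * i] for i in range(20)]
masked = mask_at_random(toy, 0.10)
deleted = drop_incomplete(masked)      # fewer rows, all complete
imputed = impute_column_means(masked)  # same number of rows, gaps filled
print(len(toy), len(deleted), len(imputed))
```

Each cleaned dataset would then be passed to the same prediction models, and `r_squared` computed on held-out data, to produce a comparison of the kind reported in Table 11.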

Comparing Table 11 with the original Table IV shows that missing data does not significantly change the results for this dataset. Random Forest therefore remains the selected prediction method in this work, even in the presence of missing data, since it has the best R-square value. Readers interested in handling missing data are referred to, e.g., Springuel et al. (2019) and Nissen et al. (2019).

After scaling the data, the mean of the chance-of-admission column is 0.61, as seen in Table III, which indicates that the data are nearly balanced and that there is no need to handle imbalanced data. If the data were imbalanced, there are several ways to handle this, for instance resampling the training set through under-sampling or over-sampling, among others; refer to Kaur and Gosain (2020) for more detail.
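As a rough illustration of the resampling options mentioned above, the sketch below implements plain random over-sampling and under-sampling. It assumes the target has been turned into discrete class labels, which this appendix does not describe; the function names and toy data are hypothetical, and Kaur and Gosain (2020) discuss more robust hybrid schemes.

```python
import random

def _group_by_label(rows, labels):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels

def random_undersample(rows, labels, seed=0):
    """Keep a random subset of larger classes down to the smallest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = min(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels

# Toy training set (made-up): eight rows of class 1 and two rows of class 0.
train_rows = [[i] for i in range(10)]
train_labels = [1] * 8 + [0] * 2
over_rows, over_labels = random_oversample(train_rows, train_labels)
under_rows, under_labels = random_undersample(train_rows, train_labels)
print(over_labels.count(0), over_labels.count(1))
print(under_labels.count(0), under_labels.count(1))
```

Resampling is applied to the training set only, so that the test set keeps the original class proportions and the reported scores stay representative.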

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kiaghadi, M., Hoseinpour, P. University admission process: a prescriptive analytics approach. Artif Intell Rev 56, 233–256 (2023). https://doi.org/10.1007/s10462-022-10171-y

Accepted: 11 March 2022
Published: 19 August 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s10462-022-10171-y
