University admission process: a prescriptive analytics approach

Artificial Intelligence Review

Abstract

Students typically lack practical tools to help them choose which universities to apply to. This work proposes a comprehensive analytics framework that serves as a decision support tool for students during the admission process. As an essential element of the framework, a prediction procedure is developed to accurately estimate the student's chance of admission to each university using various machine learning methods. It is concluded that random forest combined with kernel principal component analysis outperforms the other prediction models. In addition, an online survey is built to elicit the student's utility for each university. A mathematical programming model is then proposed to determine the best universities to apply to among the candidates, subject to practical limitations, the most important of which is the student's budget. The model is also extended to consider multiple objectives. Finally, a case study is provided to show the practicality of the developed decision support tool.

Notes

  1. https://www.usnews.com/education/best-colleges/articles/2016-09-22/how-competitive-is-college-admissions

  2. https://www.admissionpros.com/blog/top-4-challenges-for-student-recruitment-professionals

  3. https://www.unigo.com/

  4. https://forms.gle/y3NoD2sDjs6cstGd6

References

  • Abbas AE (2010) Constructing multiattribute utility functions for decision analysis. In: Risk and optimization in an uncertain world. INFORMS, pp 62–98

  • Achabal DD, McIntyre SH, Smith SA, Kalyanam K (2000) A decision support system for vendor managed inventory. J Retail 76(4):430–454

  • Acharya MS, Armaan A, Antony AS (2019) A comparison of regression models for prediction of graduate admissions. In: 2019 International conference on computational intelligence in data science (ICCIDS). IEEE, pp 1–5

  • Adekitan AI, Noma-Osaghae E (2019) Data mining approach to predicting the performance of first year student in a university using the admission requirements. Educ Inf Technol 24(2):1527–1543

  • Albawi S, Mohammed TA, Al-Zawi S (2017) Understanding of a convolutional neural network. In: 2017 International conference on engineering and technology. IEEE, pp 1–6

  • Asif R, Merceron A, Ali SA, Haider NG (2017) Analyzing undergraduate students’ performance using educational data mining. Comput Educ 113:177–194

  • Audet C, Hare W (2017) Biobjective optimization. In: Derivative-free and blackbox optimization. Springer, New York, pp 247–262

  • Baucells M, Sarin RK (2003) Group decisions with multiple criteria. Manage Sci 49(8):1105–1118

  • Belloni A, Lovett MJ, Boulding W, Staelin R (2012) Optimal admission and scholarship decisions: choosing customized marketing offers to attract a desirable mix of customers. Mark Sci 31(4):621–636

  • Board S (2009) Preferences and utility. UCLA, Los Angeles

  • Chui KT, Fung DCL, Lytras MD, Lam TM (2020) Predicting at-risk university students in a virtual learning environment via a machine learning algorithm. Comput Hum Behav 107:105584

  • Ding L (2019) Theoretical perspectives of quantitative physics education research. Phys Rev Phys Educ Res 15(2):020101

  • Dumitrescu E, Hue S, Hurlin C, Tokpavi S (2022) Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects. Eur J Oper Res 297(3):1178–1192

  • Egorow O, Siegert I, Wendemuth A (2018) Improving emotion recognition performance by random-forest-based feature selection. In: International conference on speech and computer. Springer, Berlin, pp 134–144

  • Esteban A, Zafra A, Romero C (2020) Helping university students to choose elective courses by using a hybrid multi-criteria recommendation system with genetic optimization. Knowl Based Syst 194:105385

  • Ghai B (2015) Analysis & prediction of American graduate admissions process. Stony Brook University, Department of Computer Science

  • Gharroudi O, Elghazel H, Aussem A (2014) A comparison of multi-label feature selection methods using the random forest paradigm. In: Canadian conference on artificial intelligence. Springer, New York, pp 95–106

  • Ghodsypour SH, O’Brien C (1998) A decision support system for supplier selection using an integrated analytic hierarchy process and linear programming. Int J Prod Econ 56:199–212

  • Gray CC, Perkins D (2019) Utilizing early engagement and machine learning to predict student outcomes. Comput Educ 131:22–32

  • Gupta N, Sawhney A, Roth D (2016) Will I get in? Modeling the graduate admission process for American universities. In: 2016 IEEE 16th international conference on data mining workshops (ICDMW). IEEE, pp 631–638

  • Helal S, Li J, Liu L, Ebrahimie E, Dawson S, Murray DJ, Long Q (2018) Predicting academic performance by considering student heterogeneity. Knowl-Based Syst 161:134–146

  • Hoffait A-S, Schyns M (2017) Early detection of university students with potential difficulties. Decis Support Syst 101:1–11

  • Hussain M, Zhu W, Zhang W, Abidi SMR, Ali S (2019) Using machine learning to predict student difficulties from learning session data. Artif Intell Rev 52(1):381–407

  • Injadat M, Moubayed A, Nassif AB, Shami A (2020) Systematic ensemble model selection approach for educational data mining. Knowl Based Syst 200:105992

  • Jansen SJ (2011) The multi-attribute utility method. In: Jansen SJT et al (eds) The measurement and analysis of housing preference and choice. Springer, New York, pp 101–125

  • Kaur P, Gosain A (2020) Robust hybrid data-level sampling approach to handle imbalanced data during classification. Soft Comput 24(20):15715–15732

  • Kim D, Kim N, Cho J, Shin H (2019) Optimizing the multistage university admission decision process. INFORMS J Appl Anal 49(6):422–429

  • Kotsiantis SB (2012) Use of machine learning techniques for educational proposes: a decision support system for forecasting students’ grades. Artif Intell Rev 37(4):331–344

  • Kutner MH, Nachtsheim CJ, Neter J, Li W (2005) Applied linear statistical models, 5th edn. McGraw-Hill/Irwin, New York

  • Li S, Harner EJ, Adjeroh DA (2011) Random KNN feature selection-a fast and stable alternative to Random Forests. BMC Bioinformatics 12(1):1–11

  • Lykourentzou I, Giannoukos I, Nikolopoulos V, Mpardis G, Loumos V (2009) Dropout prediction in e-learning courses through the combination of machine learning techniques. Comput Educ 53(3):950–965

  • Maldonado S, Armelini G, Guevara CA (2017) Assessing university enrollment and admission efforts via hierarchical classification and feature selection. Intelligent Data Analysis 21(4):945–962

  • Maltz EN, Murphy KE, Hand ML (2007) Decision support for university enrollment management: Implementation and experience. Decis Support Syst 44(1):106–123

  • Mansmann S, Scholl MH (2007) Decision support system for managing educational capacity utilization. IEEE Trans Educ 50(2):143–150

  • Mengash HA (2020) Using data mining techniques to predict student performance to support decision making in university admission systems. IEEE Access 8:55462–55470

  • Moore JS (1998) An expert system approach to graduate school admission decisions and academic performance prediction. Omega 26(5):659–670

  • Moxnes E (2004) Estimating customer utility of energy efficiency standards for refrigerators. J Econ Psychol 25(6):707–724

  • Ngai EW, Wat F (2005) Fuzzy decision support system for risk analysis in e-commerce development. Decis Support Syst 40(2):235–255

  • Nissen J, Donatello R, Van Dusen B (2019) Missing data and bias in physics education research: a case for using multiple imputation. Phys Rev Phys Educ Res 15(2):020106

  • Partridge M, Calvo RA (1998) Fast dimensionality reduction and simple PCA. Intell Data Anal 2(3):203–214

  • Picard RR, Cook RD (1984) Cross-validation of regression models. J Am Stat Assoc 79(387):575–583

  • Probst P, Wright MN, Boulesteix AL (2019) Hyperparameters and tuning strategies for random forest. Data Min Knowl Discov 9(3):e1301

  • Ragab AHM, Mashat AFS, Khedra AM (2012) HRSPCA: hybrid recommender system for predicting college admission. In: 2012 12th International conference on intelligent systems design and applications (ISDA). IEEE, pp 107–113

  • Raschka S (2018) Model evaluation, model selection, and algorithm selection in machine learning. https://arxiv.org/abs/1811.12808

  • Rodriguez JD, Perez A, Lozano JA (2009) Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans Pattern Anal Mach Intell 32(3):569–575

  • Rutkowski L, Jaworski M, Pietruczuk L, Duda P (2014) The CART decision tree for mining data streams. Inf Sci 266:1–15

  • Speiser JL, Miller ME, Tooze J, Ip E (2019) A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst Appl 134:93–101

  • Springuel RP, Wittmann MC, Thompson JR (2019) Reconsidering the encoding of data in physics education research. Phys Rev Phys Educ Res 15(2):020103

  • Stone M (1978) Cross-validation: a review. Statistics 9(1):127–139

  • Van Dusen B, Nissen J (2019) Modernizing use of regression models in physics education research: a review of hierarchical linear modeling. Phys Rev Phys Educ Res 15(2):020108

  • Walczak S, Sincich T (1999) A comparative analysis of regression and neural networks for university admissions. Inf Sci 119(1–2):1–20

  • Waters A, Miikkulainen R (2014) Grade: machine learning support for graduate admissions. AI Mag 35(1):64–64

  • Wu H, Lin A, Xing X, Song D, Li Y (2021) Identifying core driving factors of urban land use change from global land cover products and POI data using the random forest method. Int J Appl Earth Observ Geoinform 103:102475

  • Young NT, Caballero MD (2019) Using machine learning to understand physics graduate school admissions. https://arxiv.org/abs/1907.01570

Author information

Corresponding author

Correspondence to Pooya Hoseinpour.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Details on preprocessing

Section 4.1 established that the dataset used in this framework is clean and has no missing values. If a dataset does contain missing data, there are several ways to handle it, including deleting the records with missing values, replacing each missing value with the mean of its column, and predicting the missing values. To check the impact of missing data on the prediction models, a new dataset is created from the original by intentionally removing 10% of the data. The first two methods above, i.e., mean replacement and deletion, are then used to handle the missing values. The R-square values of the prediction models on the new dataset are reported in Table 11.
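The missing-data experiment can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the article: the data are synthetic, the 10% of values are removed entry by entry and completely at random, and ordinary least squares stands in for the article's prediction models so that the example needs only NumPy. The names `r_squared`, `X_imputed`, and `X_deleted` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an admissions dataset (assumption, not the article's data).
n, d = 500, 4
X = rng.normal(size=(n, d))
true_w = np.array([0.5, -0.3, 0.8, 0.1])
y = X @ true_w + rng.normal(scale=0.2, size=n)


def r_squared(features, target):
    """Fit least squares with an intercept and return the in-sample R-square."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)


# Remove roughly 10% of the feature entries completely at random.
mask = rng.random(X.shape) < 0.10
X_missing = X.copy()
X_missing[mask] = np.nan

# Method 1: replace each missing entry with the mean of its column.
col_means = np.nanmean(X_missing, axis=0)
X_imputed = np.where(np.isnan(X_missing), col_means, X_missing)

# Method 2: deletion, i.e., drop every row that has at least one missing entry.
complete = ~np.isnan(X_missing).any(axis=1)
X_deleted, y_deleted = X_missing[complete], y[complete]

r2_full = r_squared(X, y)
r2_imputed = r_squared(X_imputed, y)
r2_deleted = r_squared(X_deleted, y_deleted)
print(f"full: {r2_full:.3f}  mean-imputed: {r2_imputed:.3f}  deleted: {r2_deleted:.3f}")
```

Comparing the three R-square values mirrors the comparison made in the appendix; the numbers produced by this sketch say nothing about the article's dataset.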

Table 11 Comparison of prediction methods on new dataset, including missing values

Comparing Table 11 with the original Table IV shows that missing data does not significantly change the results for this dataset. Therefore, Random Forest remains the selected prediction method in this work even in the presence of missing data, since it has the best R-square value. Readers interested in handling missing data are referred to, e.g., Springuel et al. (2019) and Nissen et al. (2019).

After scaling the data, the mean of the chance-of-admission column is 0.61, as seen in Table III, which indicates that the data are nearly balanced and that no special handling of imbalanced data is needed. If the data were imbalanced, there are several ways to handle this, for instance resampling the training set by under-sampling or over-sampling, among others; see Kaur and Gosain (2020) for more detail.
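As a rough illustration of the resampling options mentioned for imbalanced data, the sketch below applies random over-sampling and random under-sampling to a hypothetical binary label with about 10% positives. This is an assumption-laden example rather than the article's procedure: the data are synthetic, the functions `random_oversample` and `random_undersample` are illustrative names, and Kaur and Gosain (2020) discuss more robust hybrid approaches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical imbalanced training set: roughly 10% of labels are positive.
n = 1000
X = rng.normal(size=(n, 3))
y = (rng.random(n) < 0.10).astype(int)


def random_oversample(features, labels, rng):
    """Duplicate minority-class rows (with replacement) until all classes match the largest."""
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        extra = rng.choice(idx, size=target - count, replace=True)
        parts.append(np.concatenate([idx, extra]))
    order = rng.permutation(np.concatenate(parts))
    return features[order], labels[order]


def random_undersample(features, labels, rng):
    """Keep a random subset of each class, sized to match the smallest class."""
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.min()
    parts = [
        rng.choice(np.flatnonzero(labels == cls), size=target, replace=False)
        for cls in classes
    ]
    order = rng.permutation(np.concatenate(parts))
    return features[order], labels[order]


X_over, y_over = random_oversample(X, y, rng)
X_under, y_under = random_undersample(X, y, rng)
print("original:", np.bincount(y))
print("over-sampled:", np.bincount(y_over))
print("under-sampled:", np.bincount(y_under))
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keep their original class proportions.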

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kiaghadi, M., Hoseinpour, P. University admission process: a prescriptive analytics approach. Artif Intell Rev 56, 233–256 (2023). https://doi.org/10.1007/s10462-022-10171-y

  • DOI: https://doi.org/10.1007/s10462-022-10171-y
