Clustering-based knowledge graphs and entity-relation representation improves the detection of at risk students

Education and Information Technologies

Abstract

Technological advances and online learning platforms have transformed education, giving institutions more options than ever to thrive in a complex and competitive environment. Institutions nevertheless still face challenges such as academic underachievement, delayed graduation, and student dropout. Student data held in institutional databases and online platforms make it possible to predict the academic performance of individual students at an early stage. In this study, we applied knowledge graphs (KG), clustering, and machine learning (ML) techniques to data on students in the College of Information Technology at UAEU. To construct the knowledge graphs and visualize students’ performance at various checkpoints, we used Neo4j, a high-performance NoSQL graph database. The findings show that combining clustered knowledge graphs with machine learning reduces prediction errors, improves classification accuracy, and effectively identifies students at risk of failing a course. Visualization additionally supports communication and decision-making within educational institutions. The combination of KGs and ML enables course instructors to rank students and to tailor learning interventions and remedial actions for at-risk students to each student’s performance, capabilities, and profile.
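
To make the general idea of a clustered knowledge graph concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a toy one-dimensional k-means over hypothetical checkpoint averages, and the node labels `Student` and `Cluster`, the relationship `BELONGS_TO`, and all student identifiers and grades are invented for illustration. It groups students by grade and emits Cypher `MERGE` statements that could be run against a Neo4j database to link each student to a cluster node.

```python
from statistics import mean


def kmeans_1d(values, centers, iterations=20):
    """Toy 1-D k-means: assign each value to its nearest center, then recompute centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Keep the old center when a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


def assign(value, centers):
    """Return the index of the center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def to_cypher(student_id, cluster_id):
    """Build a Cypher statement linking a student node to a cluster node (hypothetical schema)."""
    return (
        f"MERGE (s:Student {{id: '{student_id}'}}) "
        f"MERGE (c:Cluster {{id: {cluster_id}}}) "
        f"MERGE (s)-[:BELONGS_TO]->(c)"
    )


# Hypothetical checkpoint averages (0-100); not data from the study.
grades = {"s1": 35.0, "s2": 42.0, "s3": 78.0, "s4": 85.0, "s5": 91.0}

# Sort the centers so that cluster 0 is always the lowest-scoring group.
centers = sorted(kmeans_1d(list(grades.values()), centers=[40.0, 80.0]))
clusters = {sid: assign(g, centers) for sid, g in grades.items()}
statements = [to_cypher(sid, cid) for sid, cid in clusters.items()]

# In this sketch, membership in the lowest cluster is treated as "at risk".
at_risk = [sid for sid, cid in clusters.items() if cid == 0]
print(at_risk)
```

In a fuller setup, cluster membership (or embeddings learned from the resulting graph) would be passed as additional features to a classifier or regressor; the threshold-free "lowest cluster" rule used here is only a placeholder for that step.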

Availability of data and materials

The dataset used in this study, “EduRisk”, is available from the corresponding author upon reasonable request through bi-dac.com/download.

Abbreviations

Acc: Accuracy
AdaBoost: Adaptive Boosting
AI: Artificial Intelligence
CP: Checkpoints
DL: Deep Learning
DS: Dataset Features
EF: Embedded Features
GCN: Graph Convolution Neural Network
HF: Historical Features
KG: Knowledge Graph
KGE: Knowledge Graph Embeddings
LGBM: Light Gradient Boosting Machine
LR: Linear Regression
MAE: Mean Absolute Error
ML: Machine Learning
MSE: Mean Squared Error
NoSQL: Not Only SQL
PCA: Principal Component Analysis
RF: Random Forest
SVM: Support Vector Machine
TG: Total Grade
TPR: True Positive Rate
HW: Homework assignment
LGB: Light Gradient Boosting

Funding

Not applicable.

Author information

Corresponding author

Correspondence to Balqis Albreiki.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Albreiki, B., Habuza, T., Palakkal, N. et al. Clustering-based knowledge graphs and entity-relation representation improves the detection of at risk students. Educ Inf Technol 29, 6791–6820 (2024). https://doi.org/10.1007/s10639-023-11938-8

  • DOI: https://doi.org/10.1007/s10639-023-11938-8
