DOI: 10.1145/2695664.2695918
research-article

Towards automatic prediction of student performance in STEM undergraduate degree programs

Published: 13 April 2015

Abstract

STEM refers to learning in the fields of Science, Technology, Engineering and Mathematics. In Brazil, many students leave the educational system before earning a tertiary degree in these fields. Poor academic performance in STEM undergraduate courses is a problem faced by many universities in both developed and emerging countries. Although these universities store large amounts of data, few studies address educational data mining (EDM) software tools designed to help educational managers analyze student learning and improve the quality of undergraduate degree programs. Our approach may assist managers in monitoring students at the end of each academic term, enabling them to identify students who are having difficulty fulfilling the academic requirements toward a degree. This paper presents quantitative experimental studies on a large dataset of real data from five traditional STEM undergraduate courses at one of the largest Brazilian public universities. The results show that data mining algorithms can build effective prediction models from existing student data.

        Published In

        SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing
        April 2015
        2418 pages
        ISBN:9781450331968
        DOI:10.1145/2695664

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. STEM
        2. algorithm
        3. educational data mining (EDM)
        4. prediction model

        Qualifiers

        • Research-article

        Conference

        SAC 2015: Symposium on Applied Computing
        April 13 - 17, 2015
        Salamanca, Spain

        Acceptance Rates

        SAC '15 Paper Acceptance Rate 291 of 1,211 submissions, 24%;
        Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

        Cited By

        • (2024) Systematic Review and Analysis of EDM for Predicting the Academic Performance of Students. Journal of The Institution of Engineers (India): Series B, 105:4 (1021-1071). DOI: 10.1007/s40031-024-00998-0. Online publication date: 4-Feb-2024
        • (2024) Student grade prediction for effective learning approaches using the optimized ensemble deep neural network. Education and Information Technologies. DOI: 10.1007/s10639-024-13224-7. Online publication date: 16-Dec-2024
        • (2023) Analyzing feature importance for a predictive undergraduate student dropout model. Computer Science and Information Systems, 20:1 (175-194). DOI: 10.2298/CSIS211110050J. Online publication date: 2023
        • (2023) Extracting topological features to identify at-risk students using machine learning and graph convolutional network models. International Journal of Educational Technology in Higher Education, 20:1. DOI: 10.1186/s41239-023-00389-3. Online publication date: 10-Apr-2023
        • (2023) Student Retention Factors in a Teacher-Training Course. 2023 15th International Congress on Advanced Applied Informatics Winter (IIAI-AAI-Winter) (101-104). DOI: 10.1109/IIAI-AAI-Winter61682.2023.00027. Online publication date: 11-Dec-2023
        • (2023) Staying Ahead of the Curve: Early Prediction of Academic Probation among First-Year CS Students. 2023 3rd International Conference on Applied Artificial Intelligence (ICAPAI) (1-7). DOI: 10.1109/ICAPAI58366.2023.10194020. Online publication date: 2-May-2023
        • (2023) Classification Technique and its Combination with Clustering and Association Rule Mining in Educational Data Mining: A survey. Engineering Applications of Artificial Intelligence, 122 (106071). DOI: 10.1016/j.engappai.2023.106071. Online publication date: Jun-2023
        • (2023) Learners' Performance Evaluation Using Genetic Algorithms. Advances on Intelligent Computing and Data Science (88-99). DOI: 10.1007/978-3-031-36258-3_8. Online publication date: 17-Aug-2023
        • (2020) Predicting Student Performance and Its Influential Factors Using Hybrid Regression and Multi-Label Classification. IEEE Access, 8 (203827-203844). DOI: 10.1109/ACCESS.2020.3036572. Online publication date: 2020
        • (2018) Predicting academic performance: a systematic literature review. Proceedings Companion of the 23rd Annual ACM Conference on Innovation and Technology in Computer Science Education (175-199). DOI: 10.1145/3293881.3295783. Online publication date: 2-Jul-2018
