Abstract
National-scale exams are used for college admission in several countries, such as the Gaokao in China, the Scholastic Aptitude Test (SAT) and the American College Testing (ACT) in the United States of America, and the Yükseköğretime Geçiş Sınavı (YGS) in Turkey, among others. This paper examines microdata from the database of the High School National Examination (ENEM) in Brazil. The database has 8,627,367 records and 166 attributes, and all experiments were performed on the Spark architecture. The objective of this work is to examine the ENEM microdata by applying data mining algorithms, creating an approach to handle big data and to predict the profile of those enrolled in ENEM. From the patterns found by the classification algorithms, it was possible to observe that family income, access to information, and the profession and academic history of the parents were directly related to the performance of the candidates. With a rule induction algorithm, it was possible to identify the patterns present in each region of Brazil, such as the common characteristics of candidates who were approved and of those who were not, and essential factors such as disciplines and the particular characteristics of each region. This approach also enables large volumes of data to be processed on a simplified computational structure.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
de Castro Rodrigues, D., Dias de Lima, M., da Conceição, M.D., de Siqueira, V.S., M. Barbosa, R. (2019). A Data Mining Approach Applied to the High School National Examination: Analysis of Aspects of Candidates to Brazilian Universities. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol 11804. Springer, Cham. https://doi.org/10.1007/978-3-030-30241-2_1
DOI: https://doi.org/10.1007/978-3-030-30241-2_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30240-5
Online ISBN: 978-3-030-30241-2
eBook Packages: Computer Science, Computer Science (R0)