Abstract
Evolutionary algorithms for subgroup discovery usually initialize their population at random, so they often spend part of their time evaluating unpromising solutions and take longer to converge to good ones. In this paper, we present a new initial population construction heuristic for DINOS, a genetic subgroup discovery algorithm that mines high-quality, non-redundant subgroups in a short time. The proposed heuristic generates a collection of decision trees and uses them to obtain an initial population in which all rules are valid and that has large coverage of the database. These rules are also of high quality and use a wide variety of attributes, which makes the heuristic suitable for problems with a large number of dimensions. The experiments carried out show that the new method mines more high-quality and diverse subgroups at the cost of slightly higher computation time.
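The general idea of seeding a rule population from decision trees can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the tree learner or how diversity is obtained, so the sketch assumes categorical attributes, a greedy entropy-based split, and a random attribute subset per tree. The names `initial_population`, `tree_rules`, `best_attribute`, and `covers`, as well as the toy dataset, are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def best_attribute(rows, labels, attrs):
    """Pick the attribute whose split gives the lowest weighted entropy."""
    def split_entropy(attr):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(attrs, key=split_entropy)


def tree_rules(rows, labels, attrs, max_depth, conditions=()):
    """Grow a tree greedily and return its root-to-leaf paths as rules.

    Each rule is (conditions, predicted_class), where conditions is a
    tuple of (attribute, value) pairs.
    """
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or not attrs or len(set(labels)) == 1:
        return [(conditions, majority)]
    attr = best_attribute(rows, labels, attrs)
    remaining = [a for a in attrs if a != attr]
    rules = []
    # Only values present in the data are expanded, so no branch is empty.
    for value in sorted({row[attr] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[attr] == value]
        rules += tree_rules([rows[i] for i in idx], [labels[i] for i in idx],
                            remaining, max_depth - 1, conditions + ((attr, value),))
    return rules


def initial_population(rows, labels, n_trees=5, attrs_per_tree=2, max_depth=2, seed=0):
    """Collect the distinct rules of several trees built on random attribute subsets."""
    rng = random.Random(seed)
    all_attrs = sorted(rows[0])
    population = set()
    for _ in range(n_trees):
        # Assumption: diversity comes from giving each tree different attributes.
        attrs = rng.sample(all_attrs, min(attrs_per_tree, len(all_attrs)))
        for conditions, target in tree_rules(rows, labels, attrs, max_depth):
            if conditions:  # skip the empty rule of a tree that never split
                population.add((tuple(sorted(conditions)), target))
    return sorted(population)


def covers(conditions, row):
    """True if the row satisfies every (attribute, value) condition."""
    return all(row[a] == v for a, v in conditions)


# Toy dataset (hypothetical), one dict per example.
rows = [
    {"outlook": "sunny", "windy": "no", "humidity": "high"},
    {"outlook": "sunny", "windy": "yes", "humidity": "high"},
    {"outlook": "rainy", "windy": "no", "humidity": "normal"},
    {"outlook": "rainy", "windy": "yes", "humidity": "normal"},
    {"outlook": "overcast", "windy": "no", "humidity": "high"},
    {"outlook": "overcast", "windy": "yes", "humidity": "normal"},
]
labels = ["no", "no", "yes", "no", "yes", "yes"]

population = initial_population(rows, labels)
for conditions, target in population:
    print(conditions, "->", target)
```

Because every branch is created from attribute values that actually occur in the data, each extracted rule covers at least one example, and the leaves of any tree that splits partition the whole dataset. In this sketch, that is what makes the seeded rules valid and gives the population broad coverage before the genetic search starts.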
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Bravo-Ilisástigui, L., Reyes-Morales, L., Martín, D., García-Borroto, M. (2021). A Novel Initial Population Construction Heuristic for the DINOS Subgroup Discovery Algorithm. In: Hernández Heredia, Y., Milián Núñez, V., Ruiz Shulcloper, J. (eds) Progress in Artificial Intelligence and Pattern Recognition. IWAIPR 2021. Lecture Notes in Computer Science, vol 13055. Springer, Cham. https://doi.org/10.1007/978-3-030-89691-1_27
DOI: https://doi.org/10.1007/978-3-030-89691-1_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89690-4
Online ISBN: 978-3-030-89691-1
eBook Packages: Computer Science, Computer Science (R0)