
A Novel Initial Population Construction Heuristic for the DINOS Subgroup Discovery Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13055)

Abstract

Evolutionary algorithms for subgroup discovery usually initialize the population at random, so they often spend part of their time evaluating unpromising solutions and take longer to converge to good ones. In this paper, we present a new initial population construction heuristic for DINOS, a genetic subgroup discovery algorithm that mines high-quality, non-redundant subgroups in a short time. The proposed heuristic generates a collection of decision trees, which yields an initial population in which all rules are valid and which together cover a large part of the database. These rules are also of high quality and use a wide variety of attributes, which makes the heuristic suitable for problems with a large number of dimensions. The experiments show that the new method mines more high-quality and diverse subgroups at the cost of a slightly higher computational time.
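
The general idea of seeding a rule population from a collection of decision trees can be illustrated with a minimal sketch. This is not the authors' procedure: the tree learner (a tiny error-minimizing splitter on categorical attributes), the random attribute subsets used to diversify the trees, the toy dataset, and all function names are assumptions made for illustration only. Each root-to-leaf path becomes one rule, so every rule covers at least one record and each tree's rules together cover the whole dataset.

```python
import random
from collections import Counter

def build_tree(rows, attrs, target, depth=0, max_depth=3):
    """Grow a small categorical tree; leaves are class labels, inner nodes are dicts."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or not attrs or len(set(labels)) == 1:
        return majority

    def errors(attr):
        # Rows misclassified if each branch predicts its majority class.
        groups = {}
        for r in rows:
            groups.setdefault(r[attr], []).append(r[target])
        return sum(len(g) - Counter(g).most_common(1)[0][1] for g in groups.values())

    best = min(attrs, key=errors)
    rest = [a for a in attrs if a != best]
    children = {}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        children[value] = build_tree(subset, rest, target, depth + 1, max_depth)
    return {"attr": best, "children": children}

def tree_to_rules(node, conditions=()):
    """Turn every root-to-leaf path into an (antecedent, predicted class) rule."""
    if not isinstance(node, dict):
        return [(dict(conditions), node)]
    rules = []
    for value, child in node["children"].items():
        rules += tree_to_rules(child, conditions + ((node["attr"], value),))
    return rules

def covers(antecedent, row):
    """True if the row satisfies every attribute-value condition of the rule."""
    return all(row[a] == v for a, v in antecedent.items())

def initial_population(rows, attrs, target, n_trees=5, attrs_per_tree=2, seed=0):
    """Build several trees on random attribute subsets and pool their rules."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_trees):
        chosen = rng.sample(attrs, min(attrs_per_tree, len(attrs)))
        tree = build_tree(rows, chosen, target)
        for rule in tree_to_rules(tree):
            # Skip rules with an empty antecedent and exact duplicates.
            if rule[0] and rule not in population:
                population.append(rule)
    return population

# Hypothetical toy dataset, used only to exercise the sketch.
ATTRS = ["outlook", "wind", "humidity"]
DATA = [
    {"outlook": "sunny", "wind": "weak", "humidity": "high", "play": "no"},
    {"outlook": "sunny", "wind": "strong", "humidity": "high", "play": "no"},
    {"outlook": "overcast", "wind": "weak", "humidity": "high", "play": "yes"},
    {"outlook": "rain", "wind": "weak", "humidity": "normal", "play": "yes"},
    {"outlook": "rain", "wind": "strong", "humidity": "normal", "play": "no"},
    {"outlook": "overcast", "wind": "strong", "humidity": "normal", "play": "yes"},
    {"outlook": "sunny", "wind": "weak", "humidity": "normal", "play": "yes"},
    {"outlook": "rain", "wind": "weak", "humidity": "high", "play": "yes"},
]

if __name__ == "__main__":
    for antecedent, label in initial_population(DATA, ATTRS, "play"):
        support = sum(covers(antecedent, r) for r in DATA)
        print(antecedent, "->", label, f"(covers {support} rows)")
```

Sampling a different attribute subset for each tree is one simple way to obtain rules that differ in the attributes they use; a genetic algorithm would then take these rules as its starting individuals instead of randomly generated ones.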

Corresponding author

Correspondence to Milton García-Borroto.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Bravo-Ilisástigui, L., Reyes-Morales, L., Martín, D., García-Borroto, M. (2021). A Novel Initial Population Construction Heuristic for the DINOS Subgroup Discovery Algorithm. In: Hernández Heredia, Y., Milián Núñez, V., Ruiz Shulcloper, J. (eds) Progress in Artificial Intelligence and Pattern Recognition. IWAIPR 2021. Lecture Notes in Computer Science, vol 13055. Springer, Cham. https://doi.org/10.1007/978-3-030-89691-1_27

  • DOI: https://doi.org/10.1007/978-3-030-89691-1_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-89690-4

  • Online ISBN: 978-3-030-89691-1

  • eBook Packages: Computer Science, Computer Science (R0)
