
The Issue of Efficient Generation of Generalized Features in Algorithmic Classification Tree Methods

  • Conference paper
Data Stream Mining & Processing (DSMP 2020)

Abstract

The paper addresses a basic issue in methods for constructing algorithmic classification tree models: the problem of generating generalized features within their structure. A simple and efficient method is proposed for approximating the initial training dataset by a set of generalized features in the form of geometric objects in the feature space of the application problem at hand. This approach ensures the required classification accuracy of the trees, reduces their structural (constructional) complexity, and achieves appropriate performance indicators for the model. Based on the proposed algorithmic scheme for covering the training set, corresponding software has been developed that supports a range of applied data-processing problems of different types. Thus, a simple, efficient, and economical approximation of the initial training set provides the appropriate speed and level of complexity of the classification scheme, ensuring simple and complete recognition (coverage) of sets of discrete objects.
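The coverage idea described in the abstract can be illustrated with a minimal greedy sketch. This is a hypothetical illustration, not the authors' algorithm: each generalized feature is taken to be an axis-aligned hyperrectangle in feature space that contains training points of one class only, and boxes are grown greedily until the whole training set is covered. All function names and the growth heuristic are assumptions made for the example.

```python
import numpy as np

def cover_with_boxes(X, y):
    """Greedily cover a labeled training set with class-pure,
    axis-aligned hyperrectangles (one possible kind of
    'generalized feature' as a geometric object in feature space)."""
    boxes = []                                   # list of (lo, hi, label)
    remaining = np.ones(len(X), dtype=bool)      # points not yet covered
    while remaining.any():
        i = np.flatnonzero(remaining)[0]         # seed an uncovered point
        label = y[i]
        lo, hi = X[i].copy(), X[i].copy()
        # Try to grow the box over other uncovered same-class points,
        # keeping the box free of points from any other class.
        for j in np.flatnonzero(remaining):
            if y[j] != label:
                continue
            new_lo = np.minimum(lo, X[j])
            new_hi = np.maximum(hi, X[j])
            inside = np.all((X >= new_lo) & (X <= new_hi), axis=1)
            if np.all(y[inside] == label):       # box stays class-pure
                lo, hi = new_lo, new_hi
        covered = np.all((X >= lo) & (X <= hi), axis=1)
        remaining &= ~covered
        boxes.append((lo, hi, label))
    return boxes

def classify(boxes, x):
    """Scan the generalized features in order; the first box that
    contains x decides the class (a simple tree-like decision scan)."""
    for lo, hi, label in boxes:
        if np.all(x >= lo) and np.all(x <= hi):
            return label
    return None
```

Because each box admits only points of a single class, the resulting set of boxes recognizes every training object exactly, and the number of boxes serves as a rough measure of the structural complexity of the classification scheme.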



Corresponding author

Correspondence to Igor Povkhan.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Povkhan, I., Lupei, M., Kliap, M., Laver, V. (2020). The Issue of Efficient Generation of Generalized Features in Algorithmic Classification Tree Methods. In: Babichev, S., Peleshko, D., Vynokurova, O. (eds) Data Stream Mining & Processing. DSMP 2020. Communications in Computer and Information Science, vol 1158. Springer, Cham. https://doi.org/10.1007/978-3-030-61656-4_6


  • DOI: https://doi.org/10.1007/978-3-030-61656-4_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61655-7

  • Online ISBN: 978-3-030-61656-4

