
The Issue of Efficient Generation of Generalized Features in Algorithmic Classification Tree Methods

  • Conference paper
Data Stream Mining & Processing (DSMP 2020)

Abstract

The paper addresses a basic issue in methods for constructing algorithmic classification tree models: the problem of generating generalized features within their structure. A simple and efficient method is proposed for approximating the initial training dataset by a set of generalized features in the form of geometric objects in the feature space of the application problem at hand. This approach ensures the required classification accuracy of the trees, reduces their structural (constructional) complexity, and achieves appropriate performance indicators for the model. Based on the proposed algorithmic scheme for covering the training set, corresponding software has been developed that supports a range of applied data-processing problems of different types. Thus, a simple, efficient, and economical approximation of the initial training set provides the appropriate speed and level of complexity of the classification scheme, ensuring simple and complete recognition (coverage) of sets of discrete objects.
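The coverage idea described in the abstract can be illustrated with a minimal greedy sketch. This is a hypothetical illustration, not the authors' algorithm: each generalized feature is taken to be an axis-aligned hyperrectangle in feature space that contains training points of one class only, and boxes are grown greedily until the whole training set is covered. All function names and the growth heuristic are assumptions made for the example.

```python
import numpy as np

def cover_with_boxes(X, y):
    """Greedily cover a labeled training set with class-pure,
    axis-aligned hyperrectangles (one possible kind of
    'generalized feature' as a geometric object in feature space)."""
    boxes = []                                   # list of (lo, hi, label)
    remaining = np.ones(len(X), dtype=bool)      # points not yet covered
    while remaining.any():
        i = np.flatnonzero(remaining)[0]         # seed an uncovered point
        label = y[i]
        lo, hi = X[i].copy(), X[i].copy()
        # Try to grow the box over other uncovered same-class points,
        # keeping the box free of points from any other class.
        for j in np.flatnonzero(remaining):
            if y[j] != label:
                continue
            new_lo = np.minimum(lo, X[j])
            new_hi = np.maximum(hi, X[j])
            inside = np.all((X >= new_lo) & (X <= new_hi), axis=1)
            if np.all(y[inside] == label):       # box stays class-pure
                lo, hi = new_lo, new_hi
        covered = np.all((X >= lo) & (X <= hi), axis=1)
        remaining &= ~covered
        boxes.append((lo, hi, label))
    return boxes

def classify(boxes, x):
    """Scan the generalized features in order; the first box that
    contains x decides the class (a simple tree-like decision scan)."""
    for lo, hi, label in boxes:
        if np.all(x >= lo) and np.all(x <= hi):
            return label
    return None
```

Because each box admits only points of a single class, the resulting set of boxes recognizes every training object exactly, and the number of boxes serves as a rough measure of the structural complexity of the classification scheme.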



Corresponding author

Correspondence to Igor Povkhan.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Povkhan, I., Lupei, M., Kliap, M., Laver, V. (2020). The Issue of Efficient Generation of Generalized Features in Algorithmic Classification Tree Methods. In: Babichev, S., Peleshko, D., Vynokurova, O. (eds) Data Stream Mining & Processing. DSMP 2020. Communications in Computer and Information Science, vol 1158. Springer, Cham. https://doi.org/10.1007/978-3-030-61656-4_6


  • DOI: https://doi.org/10.1007/978-3-030-61656-4_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61655-7

  • Online ISBN: 978-3-030-61656-4

