Supervised Kohonen network with heterogeneous value difference metric for both numeric and categorical inputs

  • Methodologies and Application
  • Soft Computing

Abstract

Real-world data often carry multiple attributes, both numeric and categorical. However, existing classification algorithms for mixed numeric and categorical data have limitations in how they handle the categorical part. In this paper, a supervised Kohonen network with a heterogeneous value difference metric is proposed for both numeric and categorical inputs. It employs the framework of supervised Kohonen networks, adopts the heterogeneous value difference metric to measure dissimilarity between samples with numeric and categorical attributes, uses the frequency of each categorical item in the Voronoi set to update the categorical part of the reference vector on the competitive layer, and applies different competitive learning rules to numeric and categorical data. The effectiveness of the proposed algorithm is verified on data sets from the UCI Machine Learning Repository. Its classification accuracy is compared with that of BP, k-NN, naive Bayes network, C4.5 and SVM, and the dissimilarity metric is analyzed. The proposed classification algorithm is applied to operating mode classification for wind turbines, and its effectiveness is illustrated in condition monitoring of the pitch system of wind turbines.
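
To make the dissimilarity measure concrete, the following is a minimal Python sketch of a heterogeneous value difference metric in the commonly used form given by Wilson and Martinez (1997, "Improved heterogeneous distance functions"): numeric attributes contribute |u - v| / (4σ), categorical attributes contribute a normalized value difference based on class-conditional frequencies, and unknown values contribute 1. The abstract does not state which variant the paper uses, so the function names `fit_hvdm` and `hvdm`, the choice of the squared (vdm2) form, the population standard deviation, and the handling of unseen categories are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def fit_hvdm(X, y, categorical):
    """Collect the statistics HVDM needs (hypothetical helper).

    X: list of rows, y: list of class labels,
    categorical: set of column indices holding categorical values.
    Assumes the training rows contain no missing values.
    """
    classes = sorted(set(y))
    sigma = {}  # numeric column -> population standard deviation
    cond = {}   # categorical column -> {value: {class: P(class | value)}}
    for a in range(len(X[0])):
        col = [row[a] for row in X]
        if a in categorical:
            value_count = Counter(col)
            joint = defaultdict(Counter)
            for v, c in zip(col, y):
                joint[v][c] += 1
            cond[a] = {
                v: {c: joint[v][c] / value_count[v] for c in classes}
                for v in value_count
            }
        else:
            mean = sum(col) / len(col)
            sigma[a] = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
    return {"classes": classes, "sigma": sigma, "cond": cond,
            "categorical": set(categorical)}


def hvdm(model, u, v):
    """Heterogeneous value difference between two rows u and v."""
    total = 0.0
    for a, (ua, va) in enumerate(zip(u, v)):
        if ua is None or va is None:
            d = 1.0  # unknown value
        elif a in model["categorical"]:
            pu = model["cond"][a].get(ua)
            pv = model["cond"][a].get(va)
            if pu is None or pv is None:
                d = 1.0  # assumption: unseen category treated as unknown
            else:
                # normalized_vdm2: Euclidean gap between P(class | value)
                d = math.sqrt(sum((pu[c] - pv[c]) ** 2
                                  for c in model["classes"]))
        else:
            s = model["sigma"][a]
            # normalized_diff; a constant column contributes nothing
            d = abs(ua - va) / (4 * s) if s > 0 else 0.0
        total += d * d
    return math.sqrt(total)


if __name__ == "__main__":
    # Toy data (made up): one numeric and one categorical attribute.
    X = [[1.0, "a"], [2.0, "a"], [3.0, "b"], [4.0, "b"]]
    y = [0, 0, 1, 1]
    model = fit_hvdm(X, y, categorical={1})
    print(hvdm(model, [1.0, "a"], [4.0, "b"]))
```

In a Kohonen-style network, a function like `hvdm` would stand in for the Euclidean distance when selecting the best-matching unit. How the paper combines it with the frequency-based update of categorical reference values is not specified in the abstract and is not reproduced here.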

Acknowledgments

This research is supported by the National Natural Science Foundation of China (61102124), the Natural Science Foundation of Liaoning Province (20180551032) and the Educational Commission of Liaoning Province (LQGD2017035).

Author information

Corresponding author

Correspondence to Yuxian Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed consent

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Y., Gendeel, M.A.A., Peng, H. et al. Supervised Kohonen network with heterogeneous value difference metric for both numeric and categorical inputs. Soft Comput 24, 1763–1774 (2020). https://doi.org/10.1007/s00500-019-04001-7

  • DOI: https://doi.org/10.1007/s00500-019-04001-7
