Hierarchical feature selection based on relative dependency for gear fault diagnosis

Applied Intelligence

Abstract

Feature selection is an important topic in machine-learning-based diagnosis: it aims to remove irrelevant features so that diagnostic systems perform well. The behaviour of diagnostic models can be sensitive to the number of features, and a set of significant features can represent the problem better than the entire set. Consequently, algorithms that identify these features are valuable contributions. This work addresses the feature selection problem through attribute clustering. The proposed algorithm is inspired by existing approaches in which the relative dependency between attributes is used to calculate dissimilarity values. The centroids of the resulting clusters are selected as representative attributes. The selection algorithm proposes centroid candidates through a random process, so it includes the exploration inherent in random search. A hierarchical procedure is proposed for implementing this algorithm. At each level of the hierarchy, the entire set of available attributes is split into disjoint subsets, and the selection process is applied to each subset. Once the significant attributes have been proposed for each subset, they form a new set of available attributes, and the selection process runs again at the next level. The hierarchical implementation aims to refine the search at each level over a reduced set of selected attributes, while also reducing computation time. The approach is tested with real data collected from a test bed. The results show that the diagnostic precision of a Random Forest based classifier is over 98 % with only 12 % of the attributes in the available set.
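
The abstract does not state the formulas, so the following is only a minimal sketch of attribute clustering driven by relative dependency. It assumes discretised data, a relative dependency of the form |U/{a}| / |U/{a, b}| (the ratio of distinct-value counts, as commonly used in the rough-set literature), and a dissimilarity equal to the inverse of the mean of the two directed dependencies. The function names, the cost function, and the plain random search over centroid candidates are illustrative assumptions, not the authors' implementation.

```python
import random


def n_classes(rows, cols):
    """Count the distinct value combinations of the columns `cols`."""
    return len({tuple(row[c] for c in cols) for row in rows})


def relative_dependency(rows, a, b):
    """Assumed form: |U/{a}| / |U/{a, b}|, a value in (0, 1]."""
    return n_classes(rows, [a]) / n_classes(rows, [a, b])


def dissimilarity(rows, a, b):
    """Assumed form: inverse of the mean of the two directed dependencies."""
    mean_dep = (relative_dependency(rows, a, b)
                + relative_dependency(rows, b, a)) / 2
    return 1.0 / mean_dep


def select_centroids(rows, attrs, k, trials=50, seed=0):
    """Random search: propose k centroid candidates per trial and keep the
    proposal with the lowest total attribute-to-nearest-centroid dissimilarity."""
    rng = random.Random(seed)
    attrs = list(attrs)
    best, best_cost = None, float("inf")
    for _ in range(trials):
        candidates = rng.sample(attrs, k)
        cost = sum(min(dissimilarity(rows, a, c) for c in candidates)
                   for a in attrs)
        if cost < best_cost:
            best, best_cost = candidates, cost
    return sorted(best)


# Toy data: attributes 0, 1 and 2 determine each other; attribute 3 is unrelated.
rows = [(0, 0, 1, 0), (0, 0, 1, 1), (1, 1, 0, 0), (1, 1, 0, 1)]
selected = select_centroids(rows, attrs=range(4), k=2)
```

On this toy data, one representative is drawn from the mutually dependent group and the unrelated attribute is kept as its own centroid, which is the behaviour one would expect from clustering redundant attributes together.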

Notes

  1. For simplicity, in this work the disjoint sets are selected at random and are uniformly sized with respect to the cardinality of the set A. However, other criteria could be applied to decompose the set A, e.g., grouping features with related meaning.
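
The random, uniformly sized decomposition described in this note, together with the level-by-level procedure from the abstract, can be sketched as follows. This is an illustration under stated assumptions: `split_disjoint`, `hierarchical_select`, and the `select` callable (any routine that returns the representative attributes of a subset) are hypothetical names, and the number of subsets and levels are free parameters that the text here does not specify.

```python
import random


def split_disjoint(attrs, n_subsets, rng):
    """Shuffle the attributes and deal them into disjoint, near-equal subsets."""
    shuffled = list(attrs)
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def hierarchical_select(attrs, select, n_subsets, levels, seed=0):
    """At each level, split the available attributes, run `select` on every
    subset, and pool the survivors as the attributes of the next level."""
    rng = random.Random(seed)
    current = list(attrs)
    for _ in range(levels):
        parts = split_disjoint(current, min(n_subsets, len(current)), rng)
        current = [a for part in parts for a in select(part)]
    return current


def keep_half(part):
    """Placeholder selector (assumption): keeps half of each subset."""
    return sorted(part)[:max(1, len(part) // 2)]


parts = split_disjoint(range(10), 3, random.Random(1))
result = hierarchical_select(range(16), keep_half, n_subsets=4, levels=2)
```

In practice `keep_half` would be replaced by the actual per-subset selection step; a decomposition that groups features with related meaning would only require swapping out `split_disjoint`.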

Acknowledgments

The authors express their deep gratitude to the Secretary of Higher Education, Science, Technology and Innovation (SENESCYT) of the Republic of Ecuador and to the Prometeo program for their support of this research work. We also acknowledge the support of the GIDTEC research group of the Universidad Politécnica Salesiana in Cuenca, Ecuador, in carrying out this research.

Author information

Corresponding author

Correspondence to Mariela Cerrada.

About this article

Cite this article

Cerrada, M., Sánchez, RV., Pacheco, F. et al. Hierarchical feature selection based on relative dependency for gear fault diagnosis. Appl Intell 44, 687–703 (2016). https://doi.org/10.1007/s10489-015-0725-3

  • DOI: https://doi.org/10.1007/s10489-015-0725-3
