Elsevier

Information Sciences

Volume 259, 20 February 2014, Pages 252-268
Bayesian classifiers based on probability density estimation and their applications to simultaneous fault diagnosis

https://doi.org/10.1016/j.ins.2013.09.003

Abstract

A key characteristic of simultaneous fault diagnosis is that the features extracted from the original patterns are strongly dependent. This paper proposes a new Bayesian classifier model that removes the fundamental assumption of the naive Bayesian classifier, i.e., independence among features. In our model, optimal bandwidth selection is applied to estimate the class-conditional probability density function (p.d.f.), which is the essential part of joint p.d.f. estimation. Three well-known indices, i.e., classification accuracy, area under the ROC curve, and probability mean square error, are used to measure the performance of our model in simultaneous fault diagnosis. Simulations show that our model is significantly superior to the traditional ones when the features are dependent.

Introduction

Fault diagnosis is the problem of detecting the potential faults hidden in observed instances from a specific application domain. There are two types of fault diagnosis: single and simultaneous. In single fault diagnosis, only one fault may appear in an observed instance, whereas in simultaneous fault diagnosis, multiple faults may appear in the same instance. Single fault diagnosis has been well studied in the past decade and has been applied to various domains, such as generator winding protection [1], chemical processes [18], electrical machines [16], active magnetic bearings [20], power transformers [28], and field air defense guns [2]. With the development of science and technology, there is a growing demand for safety and reliability in modern equipment. Unlike the traditional setting in which faults arise one at a time, different faults often occur simultaneously in modern equipment due to various factors. Consequently, these faults may cause serious accidents (e.g., air disasters, marine disasters, explosions, collapses, and leakages) that lead not only to great economic losses but also to heavy casualties and environmental pollution. Therefore, an effective methodology is required to recognize potential simultaneous faults in order to avoid such accidents. However, it is very difficult to conduct simultaneous fault diagnosis accurately and effectively due to the complex combination, mixture, and disturbance of the features that reflect the single faults. A comprehensive literature search finds that only a few studies [4], [10], [24], [27], [29] tackle this problem. These methods usually use qualitative causal or quantitative analytical models to identify the simultaneous faults. Although such models provide a sound solution in principle, they usually do not work well in practical applications, and their parameters are hard to determine.

The main methodologies for handling simultaneous fault diagnosis include artificial neural networks (ANNs) [4], [24], support vector machines (SVMs) [27], [29], and Dempster–Shafer theory (DST) [10]. Different models have been designed for specific problems, namely chemical reactors [4], [24], chemical plants [29], and multi-function rotors [10]. However, the existing models have two disadvantages. (1) The computational complexity of learning their parameters is high. Given a training set of size N, the training complexities of ANNs, SVMs, and DST are O(N²), O(N³), and O(N²) respectively, which makes them unable to deal with large data. (2) They often neglect the dependence among features in the observed instances, which exists in most practical applications. For example, in heart-disease electrocardiogram (ECG) data [22], there is strong dependence between indices of vulnerability and heart rate [13]. These limitations motivate us to develop a novel simultaneous fault diagnosis model that avoids the intractable complexity and takes the dependence among features into account.

The naive Bayesian classifier (NBC) is well suited to large data because of its simplicity, low computational complexity, and low memory requirement [3]. Applying NBC to fault diagnosis is an emerging research topic. Related studies on single fault diagnosis can be found in recent references [9], [12], [15], [17]. To the best of our knowledge, this is the first attempt to establish a simultaneous fault diagnosis model based on Bayesian classifiers. In order to deal with the dependence among features, a non-naive Bayesian classifier (NNBC) is proposed to diagnose the possible faults hidden in the observed instances. It models the joint p.d.f., which is estimated with Parzen windows based on a multivariate kernel function. Specifically, the estimation is completed by seeking an optimal bandwidth for the Parzen window that minimizes the mean integrated squared error between the true p.d.f. and the estimated p.d.f.
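The idea of classifying with a Parzen-window estimate of the joint class-conditional p.d.f. can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses an isotropic Gaussian kernel, and it substitutes Scott's rule-of-thumb bandwidth for the MISE-optimal bandwidth described above. All function names (`scott_bandwidth`, `parzen_log_density`, `fit_nnbc_sketch`, `predict_nnbc_sketch`) are hypothetical.

```python
import numpy as np

def scott_bandwidth(X):
    """Rule-of-thumb bandwidth (assumption: stands in for the MISE-optimal one)."""
    n, d = X.shape
    sigma = X.std(axis=0, ddof=1).mean()  # average per-feature spread
    return sigma * n ** (-1.0 / (d + 4))

def parzen_log_density(X_train, X_query, h):
    """Log of the Parzen-window joint density with an isotropic Gaussian kernel."""
    n, d = X_train.shape
    diff = X_query[:, None, :] - X_train[None, :, :]       # shape (m, n, d)
    sq_dist = (diff ** 2).sum(axis=2) / (h * h)            # shape (m, n)
    log_kernel = -0.5 * sq_dist - 0.5 * d * np.log(2 * np.pi) - d * np.log(h)
    # Average the n kernels in log space (log-sum-exp) for numerical stability.
    peak = log_kernel.max(axis=1, keepdims=True)
    log_sum = peak[:, 0] + np.log(np.exp(log_kernel - peak).sum(axis=1))
    return log_sum - np.log(n)

def fit_nnbc_sketch(X, y):
    """Store, per class, its training points, bandwidth, and prior."""
    model = {}
    for c in np.unique(y):
        Xc = X[y == c]
        model[c] = (Xc, scott_bandwidth(Xc), len(Xc) / len(X))
    return model

def predict_nnbc_sketch(model, X_query):
    """Pick the class maximizing log prior + log joint class-conditional density."""
    classes = np.array(list(model))
    scores = np.column_stack([
        np.log(prior) + parzen_log_density(Xc, X_query, h)
        for Xc, h, prior in model.values()
    ])
    return classes[scores.argmax(axis=1)]
```

Because the density is estimated jointly over all features rather than as a product of one-dimensional densities, correlations between features are reflected in the estimate. Evaluating a query costs one kernel evaluation per stored training instance, so this sketch trades prediction time for the absence of an independence assumption.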

Analysis reveals that the training complexity of NNBC is O(Nd), where N is the number of training instances and d is the number of conditional features. When d ≪ N, NNBC can therefore carry out fault diagnosis with a lower computational burden than ANNs [24], SVMs [29], and DST [10]. We compare the proposed NNBC with three NBCs based on p.d.f. estimation, namely normal naive Bayesian (NNB) [14], flexible naive Bayesian (FNB) [7], and the homologous model of FNB (FNBROT) [11], in terms of three evaluation indices: classification accuracy, area under the ROC curve (AUC) [5], [6], and probability mean square error (PMSE) [8]. The comparative results show that NNBC is uniformly and significantly superior to the other three models on all three indices and therefore provides a new way to design high-performance models for simultaneous fault diagnosis.
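To make the three evaluation indices concrete, the following Python sketch computes each of them on small arrays. It rests on assumptions that may differ from the paper's exact definitions: AUC is shown only for the two-class case via the pairwise ranking statistic (the multi-class generalisation is not reproduced here), and PMSE is taken to be the mean squared gap between 1 and the estimated posterior of the true class. The function names are hypothetical.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def binary_auc(y_true, scores):
    """Two-class AUC: probability that a positive outranks a negative (ties count 0.5)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]          # every positive/negative pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)

def pmse(y_index, posteriors):
    """Assumed PMSE: mean of (1 - estimated posterior of the true class)^2.

    y_index holds the column index of the true class for each row of posteriors.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    y_index = np.asarray(y_index)
    p_true = posteriors[np.arange(len(y_index)), y_index]
    return float(np.mean((1.0 - p_true) ** 2))
```

Accuracy looks only at the predicted label, AUC looks at how well the scores rank instances, and PMSE looks at how close the estimated posterior probabilities are to the truth, so the three indices probe different aspects of a probabilistic classifier.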

The rest of the paper is organized as follows: In Section 2, we summarize the basic naive Bayesian classifier algorithm. In Section 3, a non-naive Bayesian classification model based on the joint probability density estimation is proposed. In Section 4, we apply our proposed NNBC to simultaneous fault diagnosis. Finally, in Section 5, we conclude this paper and outline the main directions for future research.

Section snippets

A brief review on Bayesian classifiers

This section gives a brief review of naive Bayesian classifiers. We first introduce some notation.

Let X be a set of N instances. Each instance is described by d condition attributes and one decision attribute. All the condition attributes are assumed to be continuous, and the decision attribute is supposed to be discrete. Suppose that the decision attribute takes values from {w1, w2, …, wc}, which implies that all instances are categorized into c classes. In this way, any instance in

Bayesian model based on joint probability density estimation

As mentioned in Section 2, the fundamental assumption in NBC is that all condition attributes are independent. In this section, we propose an improved Bayesian classification model based on joint p.d.f. estimation, i.e., the non-naive Bayesian classifier (NNBC), which relaxes the assumption of attribute independence. First, the basic concept of joint p.d.f. estimation is introduced. Then, the optimal parameter selection in the joint p.d.f. estimation is discussed. Finally, the NNBC

Application to simultaneous faults diagnosis

In this section, we first design a device that can generate instances with single and simultaneous faults. We then demonstrate the performance of NNBC in simultaneous fault diagnosis on instances with strong dependence among features.

Conclusion and future works

In this paper, we propose a new simultaneous fault diagnosis model based on the non-naive Bayesian classifier (NNBC). It removes the independence assumption and achieves a more accurate estimation of the class-conditional p.d.f. The comparative results demonstrate that NNBC obtains remarkable improvements in classification accuracy, ranking performance, and class-conditional probability estimation. Our scheduled further development in this research topic contains: (i) seeking other

Acknowledgements

The authors thank the editors and anonymous reviewers. Their valuable and constructive comments and suggestions helped them in significantly improving this paper. The authors also thank Prof. James Liu for his instructions on the improvement of language quality. This research is supported by the National Natural Science Foundation of China (71371063, 61170040 and 60903089), by the Natural Science Foundation of Hebei Province (F2013201110, F2012201023 and F2011201063), by the Key Scientific

References (30)

  • D.J. Hand et al., A simple generalisation of the area under the ROC curve for multiple class classification problems, Machine Learning (2001)
  • L.X. Jiang, Z.H. Cai, D.H. Wang, H. Zhang, Bayesian citation-KNN with distance weighting, International Journal of...
  • G.H. John, P. Langley, Estimating continuous distributions in Bayesian classifiers, in: Proceedings of UAI’95, Quebec,...
  • M. Kobos, Combination of independent kernel density estimators in classification, in: Proceedings of IMCSIT’09,...
  • Z.L. Li et al., A new approach of simultaneous faults diagnosis based on random sets and DSmT, Journal of Electronics (China) (2009)