Bayesian classifiers based on probability density estimation and their applications to simultaneous fault diagnosis

doi:10.1016/j.ins.2013.09.003

Information Sciences

Volume 259, 20 February 2014, Pages 252-268

https://doi.org/10.1016/j.ins.2013.09.003

Abstract

A key characteristic of simultaneous fault diagnosis is that the features extracted from the original patterns are strongly dependent. This paper proposes a new Bayesian classifier model that removes the fundamental assumption of the naive Bayesian classifier, i.e., the independence among features. In our model, optimal bandwidth selection is applied to estimate the class-conditional probability density function (p.d.f.), which is the essential part of joint p.d.f. estimation. Three well-known indices, i.e., classification accuracy, area under the ROC curve, and probability mean square error, are used to measure the performance of our model in simultaneous fault diagnosis. Simulations show that our model is significantly superior to the traditional ones when dependence exists among features.

Introduction

Fault diagnosis is the problem of detecting potential faults hidden in observed instances from a specific application domain. There are two types of fault diagnosis: single and simultaneous. In single fault diagnosis, only one fault may appear in an observed instance, while in simultaneous fault diagnosis, multiple faults may appear in the same instance. Single fault diagnosis has been well studied in the past decade and has been applied to various domains, such as generator winding protection [1], chemical processes [18], electrical machines [16], active magnetic bearings [20], power transformers [28], and field air defense guns [2]. With the development of science and technology, there is a growing demand for safety and reliability in modern equipment. Unlike in the traditional single-fault setting, different faults often occur simultaneously in modern equipment due to various factors. Consequently, these faults may cause serious accidents (e.g., air disasters, marine disasters, explosions, collapses, and leakages) that lead not only to great economic losses but also to heavy casualties and environmental pollution. Therefore, an effective methodology is required to recognize potential simultaneous faults in order to avoid such accidents. However, it is very difficult to conduct simultaneous fault diagnosis accurately and effectively, because the features that reflect the single faults combine, mix, and disturb one another in complex ways. A comprehensive literature search finds that only a few studies [4], [10], [24], [27], [29] tackle this problem. These methods usually use qualitative causal or quantitative analytical models to identify the simultaneous faults. Although they provide a good solution in principle, these models usually do not work well in practical applications, and their parameters are hard to determine.

The main methodologies for handling simultaneous fault diagnosis include artificial neural networks (ANNs) [4], [24], support vector machines (SVMs) [27], [29], and Dempster–Shafer theory (DST) [10]. Different models have been designed for specific problems, namely chemical reactors [4], [24], chemical plants [29], and multi-function rotors [10]. However, the existing models have two disadvantages. (1) The computational complexity of learning their parameters is high. Given a training set of size N, the training complexities of ANNs, SVMs, and DST are O(N²), O(N³), and O(N²), respectively, which makes them unable to deal with large data sets. (2) They often neglect the dependence among features in the observed instances, which exists in most practical applications. For example, in heart-disease electrocardiograms (ECG) [22], there is strong dependence between indices of vulnerability and heart rate [13]. These limitations motivate us to develop a novel simultaneous fault diagnosis model that avoids the intractable complexity and takes the dependence among features into account.

The naive Bayesian classifier (NBC) is a competent tool for dealing with large data sets due to its simplicity, low computational complexity, and low memory requirement [3]. Applying NBC to fault diagnosis is an emerging research topic. Related studies on single fault diagnosis can be found in recent references [9], [12], [15], [17]. To the best of our knowledge, we are the first to establish a simultaneous fault diagnosis model based on Bayesian classifiers. In order to deal with the dependence among features, a non-naive Bayesian classifier (NNBC) is proposed to diagnose the possible faults hidden in the observed instances. It establishes a model of the joint p.d.f., which is estimated by using Parzen windows based on a multivariate kernel function. Specifically, the estimation is completed by seeking an optimal bandwidth for the Parzen window through minimizing the mean integrated squared error between the true p.d.f. and the estimated p.d.f.
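To make the idea of a joint-density Bayesian classifier concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It estimates each class-conditional joint p.d.f. with a Parzen window that uses an isotropic multivariate Gaussian kernel and picks the class with the largest posterior. Two points are assumptions made only for illustration: the bandwidth comes from Scott's rule of thumb as a stand-in for the MISE-optimal bandwidth described above, and the names `ParzenBayesClassifier`, `scott_bandwidth`, and `make_dependent_data` are hypothetical.

```python
import numpy as np


def scott_bandwidth(n, d):
    # Scott's rule of thumb for standardized data. This is a stand-in,
    # NOT the MISE-optimal bandwidth selection used in the paper.
    return n ** (-1.0 / (d + 4))


def log_parzen_density(x, samples, h):
    # Log of the Parzen-window estimate at point x, using an isotropic
    # multivariate Gaussian kernel with bandwidth h.
    d = samples.shape[1]
    sq_dist = np.sum((samples - x) ** 2, axis=1)
    log_k = -0.5 * sq_dist / h**2 - d * np.log(h) - 0.5 * d * np.log(2 * np.pi)
    m = np.max(log_k)  # log-mean-exp for numerical stability
    return m + np.log(np.mean(np.exp(log_k - m)))


class ParzenBayesClassifier:
    """Bayes rule with a joint (non-factorized) density estimate per class."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        # Standardize features so that one bandwidth per class is reasonable.
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0) + 1e-12
        Z = (X - self.mean_) / self.std_
        self.classes_ = np.unique(y)
        self.samples_ = [Z[y == c] for c in self.classes_]
        self.log_priors_ = np.log([np.mean(y == c) for c in self.classes_])
        self.bandwidths_ = [scott_bandwidth(len(s), X.shape[1])
                            for s in self.samples_]
        return self

    def predict_proba(self, X):
        Z = (np.asarray(X, dtype=float) - self.mean_) / self.std_
        # log posterior (up to a constant) = log prior + log joint density
        log_post = np.array([
            [lp + log_parzen_density(z, s, h)
             for lp, s, h in zip(self.log_priors_, self.samples_, self.bandwidths_)]
            for z in Z
        ])
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        return post / post.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]


def make_dependent_data(n_per_class, rng, noise=0.3):
    # Synthetic two-class data whose classes share the same marginals and
    # differ only in the dependence between the two features.
    x1 = rng.normal(size=2 * n_per_class)
    y = np.repeat([0, 1], n_per_class)
    sign = np.where(y == 0, 1.0, -1.0)
    x2 = sign * x1 + noise * rng.normal(size=2 * n_per_class)
    return np.column_stack([x1, x2]), y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train, y_train = make_dependent_data(400, rng)
    X_test, y_test = make_dependent_data(400, rng)
    clf = ParzenBayesClassifier().fit(X_train, y_train)
    print("accuracy:", np.mean(clf.predict(X_test) == y_test))
```

In the synthetic data both classes have identical per-feature marginals, so a classifier that multiplies one-dimensional densities has nothing to separate them with, whereas the joint estimate can use the sign of the correlation. The data generator is purely illustrative and unrelated to the fault-generation device used in the paper.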

Analysis reveals that the training complexity of NNBC is O(Nd), where N is the number of training instances and d is the number of conditional features. This shows that when d ≪ N, NNBC can carry out fault diagnosis with a lower computational burden than ANNs [24], SVMs [29], and DST [10]. We compare our proposed NNBC with three density-estimation-based NBCs (normal naive Bayesian (NNB) [14], flexible naive Bayesian (FNB) [7], and the homologous model of FNB (FNB_ROT) [11]) in terms of three evaluation indices, i.e., classification accuracy, area under the ROC curve (AUC) [5], [6], and probability mean square error (PMSE) [8]. The comparative results show that NNBC is uniformly and significantly superior to the other three models on all three indices and therefore provides a new way to design high-performance models for simultaneous fault diagnosis.
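As a rough illustration of the three evaluation indices, the sketch below computes them from a matrix of predicted class posteriors. It rests on assumptions that this excerpt does not confirm: PMSE is taken here to be the mean squared gap between 1 and the posterior assigned to the true class, and AUC is shown only in its two-class pairwise form rather than the multi-class generalisation cited in the text. The function names are hypothetical.

```python
import numpy as np


def accuracy(y_true, proba, classes):
    # Fraction of instances whose most probable class equals the true class.
    pred = classes[np.argmax(proba, axis=1)]
    return float(np.mean(pred == y_true))


def two_class_auc(y_true, scores, positive):
    # Probability that a random positive instance is scored above a random
    # negative one; ties count as one half (pairwise rank form of AUC).
    pos = scores[y_true == positive]
    neg = scores[y_true != positive]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def pmse(y_true, proba, classes):
    # ASSUMED definition: mean of (1 - p(true class | x))^2 over instances.
    # `classes` must be sorted (as returned by np.unique) for searchsorted.
    idx = np.searchsorted(classes, y_true)
    p_true = proba[np.arange(len(y_true)), idx]
    return float(np.mean((1.0 - p_true) ** 2))


if __name__ == "__main__":
    classes = np.array([0, 1])
    y_true = np.array([0, 0, 1, 1])
    proba = np.array([[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.3, 0.7]])
    print(accuracy(y_true, proba, classes))
    print(two_class_auc(y_true, proba[:, 1], positive=1))
    print(pmse(y_true, proba, classes))
```

Accuracy only looks at the top-ranked class, AUC looks at how well the scores rank instances, and PMSE looks at how well calibrated the posteriors are, so the three indices can disagree on which model is better.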

The rest of the paper is organized as follows: In Section 2, we summarize the basic naive Bayesian classifier algorithm. In Section 3, a non-naive Bayesian classification model based on the joint probability density estimation is proposed. In Section 4, we apply our proposed NNBC to simultaneous fault diagnosis. Finally, in Section 5, we conclude this paper and outline the main directions for future research.

Section snippets

A brief review on Bayesian classifiers

This section gives a brief review of naive Bayesian classifiers. We first introduce some notation.

Let X be a set of N instances. Each instance is described by d condition attributes and one decision attribute. All the condition attributes are assumed to be continuous, and the decision attribute is supposed to be discrete. Suppose that the decision attribute takes values from {w₁, w₂, … , w_c}, which implies that all instances are categorized into c classes. In this way, any instance in

Bayesian model based on joint probability density estimation

As mentioned in Section 2, the fundamental assumption in NBC is that all conditional attributes are independent. In this section, we propose an improved Bayesian classification model based on joint p.d.f. estimation, i.e., the non-naive Bayesian classifier (NNBC), which relaxes the assumption of attribute independence. First, the basic concept of joint p.d.f. estimation is introduced. Then, the optimal parameter selection in the joint p.d.f. estimation is discussed. Finally, the NNBC

Application to simultaneous faults diagnosis

In this section, we first design a device that can generate instances with single and simultaneous faults, then we demonstrate the performance of NNBC in simultaneous fault diagnosis on instances with strong dependence among features.

Conclusion and future works

In this paper, we propose a new simultaneous fault diagnosis model based on the non-naive Bayesian classifier (NNBC). It removes the independence assumption and achieves a more accurate estimation of the class-conditional p.d.f. The comparative results demonstrate that NNBC obtains remarkable improvements in classification accuracy, ranking performance, and class-conditional probability estimation. Our planned further development of this research topic includes: (i) seeking other

Acknowledgements

The authors thank the editors and anonymous reviewers. Their valuable and constructive comments and suggestions helped them in significantly improving this paper. The authors also thank Prof. James Liu for his instructions on the improvement of language quality. This research is supported by the National Natural Science Foundation of China (71371063, 61170040 and 60903089), by the Natural Science Foundation of Hebei Province (F2013201110, F2012201023 and F2011201063), by the Key Scientific

References (30)

S. Deng et al., Application of multiclass support vector machines for fault diagnosis of field air defense gun, Expert Systems with Applications (2011)

A. Lemos et al., Adaptive fault detection and diagnosis using an evolving fuzzy classifier, Information Sciences (2013)

M.S. Mahmoud et al., Expectation maximization approach to data-based fault diagnostics, Information Sciences (2013)

V. Muralidharan et al., A comparative study of Naive Bayes classifier and Bayes net classifier for fault diagnosis of monoblock centrifugal pump using wavelet analysis, Applied Soft Computing (2012)

H. Peng et al., Fuzzy reasoning spiking neural P system for fault diagnosis, Information Sciences (2013)

Y. Qian et al., An expert system for real-time fault diagnosis of complex chemical processes, Expert Systems with Applications (2003)

N.C. Tsai et al., Fault diagnosis for magnetic bearing systems, Mechanical Systems and Signal Processing (2009)

H.A. Darwish et al., Development and implementation of an ANN-based fault diagnosis scheme for generator winding protection, IEEE Transactions on Power Delivery (2001)

P. Domingos et al., On the optimality of the simple Bayesian classifier under zero-one loss, Machine Learning (1997)

R. Eslamloueyan et al., Multiple simultaneous fault diagnosis via hierarchical and single artificial neural networks, Scientia Iranica (2003)

D.J. Hand et al., A simple generalisation of the area under the ROC curve for multiple class classification problems, Machine Learning (2001)

L.X. Jiang, Z.H. Cai, D.H. Wang, H. Zhang, Bayesian citation-KNN with distance weighting, International Journal of...

G.H. John, P. Langley, Estimating continuous distributions in Bayesian classifiers, in: Proceedings of UAI’95, Quebec,...

M. Kobos, Combination of independent kernel density estimators in classification, in: Proceedings of IMCSIT’09,...

Z.L. Li et al., A new approach of simultaneous faults diagnosis based on random sets and DSmT, Journal of Electronics (China) (2009)

