Data-driven fault prediction and anomaly measurement for complex systems using support vector probability density estimation

doi:10.1016/j.engappai.2017.09.008

Engineering Applications of Artificial Intelligence

Volume 67, January 2018, Pages 1-13

https://doi.org/10.1016/j.engappai.2017.09.008

Abstract

To quantitatively monitor the state of a complex system, this paper studies a data-driven method for fault prediction and anomaly degree measurement based on probability density estimation. First, an anomaly index is defined to measure the anomaly degree of samples. Then, by improving the form of the constraint condition, a single slack factor multiple kernel support vector machine probability density estimation model is presented. As a result, both the scale of the objective function and the number of variables to be solved are reduced, and the computational efficiency of the model is greatly improved. In addition, introducing multiple kernel functions yields a multiple kernel matrix with better data mapping performance, which handles composite probability density estimation for uncoupled data well. A simulation test shows that the presented model achieves higher estimation precision and speed. Experiments on complex system fault prediction also show that the anomaly index obtained from the prediction results measures the system’s anomaly degree quantitatively and accurately, which effectively improves fault prediction precision and increases the prediction lead time.

Introduction

With the increasing demand for system reliability, it is desirable not only to detect and isolate a fault when it occurs, but also to forecast it before it occurs. This means that the fault can be discovered, located and eliminated at an early stage, before it has caused serious damage to the whole system. In this way, enough time is gained to take the necessary measures and prevent the fault from emerging, which avoids unnecessary loss. Especially for systems requiring high reliability, such as those in aerospace and nuclear energy, fault prediction has become a very important problem in recent years Zhou and Xu (2009), Dai and Gao (2013). In the fault prediction field, a system is commonly described as having a fault state and a failure state. The fault state means that an anomaly of a system index has occurred, but the system can still work normally. Correspondingly, the failure state means that the system index exceeds some threshold, in which case the system can no longer work.

Different systems have different reliability requirements, so it is best if an anomaly index measuring the system’s anomaly degree can be calculated algorithmically. Whether a fault should then be predicted can be decided by the operator according to the practical security requirement. In data-driven fault prediction, one applicable method is probability density estimation of the samples. On this basis, an evaluation index characterizing the system’s anomaly degree can be found and utilized.

Probability density estimation from an observed dataset is a basic problem in machine learning. There are currently two types of probability density estimation methods: parametric estimation and non-parametric estimation. The maximum likelihood method is a representative parametric estimation method, but it has limitations; for example, it cannot be used to estimate a probability density composed of several normal distributions. By contrast, non-parametric estimation methods have been more widely used. Parzen window density estimation Parzen (1962), Jenssen et al. (2006), Mohamed et al. (2004) is the most representative non-parametric method and a classical kernel density estimator. Its disadvantage is that it is not sparse: when the probability density of a new sample is estimated, all samples in the dataset are involved, so the computational cost becomes large. Researchers have therefore long sought a probability density estimation method that uses only the training samples with a great influence on the density estimate, instead of all training samples. The essence of such a method is to seek a sparse solution, so as to reduce the computational cost and improve applicability. The support vector machine (SVM) provides a good approach for obtaining sparse solutions (Vapnik and Mukherjee, 2000), as the SVM solution depends only on the support vectors among the training samples. The SVM method can therefore be used to estimate probability density in two steps: first, starting from the definition of probability density, estimate an approximate distribution function from the empirical cumulative distribution function values; second, obtain the density function by differentiation. In effect, this method uses SVM to solve a linear operator equation, and the result is a sparse probability density estimate similar in form to the Parzen window. By improving the form of the constraint condition of the SVM probability density estimation model, this paper presents a single slack factor SVM probability density estimation model. On this basis, the measurement of the system’s anomaly degree is achieved.
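The lack of sparseness of the Parzen window estimator can be illustrated with a minimal sketch. It assumes a one-dimensional Gaussian window; the data, the window width `h` and the names `gaussian_kernel` and `parzen_density` are hypothetical choices for illustration and are not taken from the paper.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the window function (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(x, samples, h):
    """Parzen window estimate at x.

    Every training sample contributes to the sum, so a single evaluation
    costs O(len(samples)); this is the non-sparse behaviour described above.
    """
    return sum(gaussian_kernel((x - s) / h) for s in samples) / (len(samples) * h)


# Hypothetical data: 200 draws from a standard normal distribution.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(200)]
h = 0.4  # window width, picked by hand for this illustration

estimate_at_zero = parzen_density(0.0, samples, h)
true_at_zero = 1.0 / math.sqrt(2.0 * math.pi)
print(f"estimate p(0) = {estimate_at_zero:.3f}, true p(0) = {true_at_zero:.3f}")
```

A sparse estimator of the kind sought in the text would have the same summation form but with nonzero weights on only a subset of the samples.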

The remainder of this paper is organized as follows. Section 2 summarizes data-driven fault prediction methods for complex systems and introduces the quantitative measurement of system anomaly based on the anomaly index. Section 3 introduces in detail the principle of the probability density model based on the single slack factor SVM and analyzes the complexity of the corresponding algorithm. In Section 4, several experiments are carried out to verify the effectiveness of the proposed method. Finally, Section 5 draws conclusions and outlines future work.

Section snippets

Data-driven fault prediction for complex systems

In the operation of some practical industrial systems, fault prediction and reliability evaluation technologies can be used to reduce the cost of system maintenance Wang et al. (2008), Ding et al. (2014), Alghazzawi and Lennox (2009). These technologies can also provide reliable evidence for determining when a system should be repaired; as a result, the blindness of device maintenance is reduced and the effective running time of the system is greatly increased. Fault

SVM probability density estimation

The idea of SVM probability density estimation is to approximate the distribution function $F(x) = \int_{-\infty}^{x} f(z)\, dz$ using support vector regression rather than to estimate the probability density function directly. For an observed sample $x_{i}$, the empirical distribution function $F_{l}(x_{i})$ can be constructed as Vapnik and Mukherjee (2000), Wang et al. (2008), Ding et al. (2014) $F_{l}(x_{i}) = \frac{1}{l} \sum_{j = 1}^{l} \prod_{k = 1}^{d} \theta(x_{i, k} - x_{j, k}),$ where $\theta(u)$ satisfies $\theta(u) = \begin{cases} 1, & u > 0 \\ 0, & u \leq 0 \end{cases}.$
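The empirical distribution function above can be computed directly from its definition. The following is a minimal sketch; the sample points and the function names `theta` and `empirical_cdf` are hypothetical and chosen only for illustration.

```python
def theta(u):
    """Step function from the text: 1 for u > 0, 0 for u <= 0."""
    return 1 if u > 0 else 0


def empirical_cdf(samples):
    """Return F_l(x_i) for every sample x_i, each given as a sequence of d coordinates."""
    l = len(samples)
    values = []
    for xi in samples:
        count = 0
        for xj in samples:
            # The product over coordinates is 1 only when x_j lies strictly
            # below x_i in every dimension.
            prod = 1
            for a, b in zip(xi, xj):
                prod *= theta(a - b)
            count += prod
        values.append(count / l)
    return values


# Hypothetical two-dimensional samples (d = 2, l = 4).
data = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0), (4.0, 4.0)]
print(empirical_cdf(data))  # prints [0.0, 0.25, 0.25, 0.75]
```

Because $\theta(0) = 0$, a sample does not count itself, so the smallest sample receives the value 0 under this definition.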

The relationship (Alghazzawi and Lennox, 2009) between

A simulation example and result analysis

Firstly, a complex system based on the Gaussian mixture model is introduced to evaluate the performance of the presented algorithm. Generate 100 random samples according to the distribution density (41) and let $\mu_{1} = 0$, $\sigma_{1} = 1$, $\mu_{2} = 6$, $\sigma_{2} = 3$. $p(x) = \frac{0.2}{\sigma_{1} \sqrt{2 \pi}} \exp\left(- \frac{(x - \mu_{1})^{2}}{2 \sigma_{1}^{2}}\right) + \frac{0.8}{\sigma_{2} \sqrt{2 \pi}} \exp\left(- \frac{(x - \mu_{2})^{2}}{2 \sigma_{2}^{2}}\right). \qquad (41)$ From (41), we can see that $x$ belongs to the two classes with the prior probabilities 0.2 and 0.8. Then estimate $p(x)$ according to the condition on uncoupled data with the known prior probability and the sample
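As a rough illustration of the sampling step described in this snippet, the sketch below draws 100 samples from the two-component Gaussian mixture with the stated parameters and evaluates the true density. The seed, the function names and the component-selection scheme are assumptions made here; the paper's own sampling procedure is not shown in the snippet.

```python
import math
import random

# Parameters and prior probabilities stated in the text.
MU1, SIGMA1, W1 = 0.0, 1.0, 0.2
MU2, SIGMA2, W2 = 6.0, 3.0, 0.8


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x):
    """True mixture density p(x) of the simulation example."""
    return W1 * normal_pdf(x, MU1, SIGMA1) + W2 * normal_pdf(x, MU2, SIGMA2)


def draw_samples(n, seed=0):
    """Draw n samples: pick a component by its prior, then sample from it."""
    rng = random.Random(seed)  # fixed seed is an arbitrary choice for repeatability
    out = []
    for _ in range(n):
        if rng.random() < W1:
            out.append(rng.gauss(MU1, SIGMA1))
        else:
            out.append(rng.gauss(MU2, SIGMA2))
    return out


samples = draw_samples(100)
print(len(samples), mixture_pdf(0.0))
```

The function `mixture_pdf` can serve as the reference curve against which any density estimate built from `samples` is compared.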

Conclusions

The SVM probability density estimation method is discussed in this paper to evaluate the degree to which a system deviates from its normal running state. For some complex systems in practical applications, however, an accurate model of the system is commonly unavailable, and the distribution that the probability density obeys is also unknown. Under this circumstance, an approximate estimate of the actual probability density can be obtained through a regression on the collected samples, which

Acknowledgments

This work was jointly supported by the National Natural Science Foundation for Young Scientists of China (Grant No: 61202332, 61403397, 61503389), China Postdoctoral Science Foundation (Grant No: 2012M521905) and Natural Science Basic Research Plan in Shaanxi Province of China (Grant No: 2015JM6313, 2016JM6061).

References (28)

Alghazzawi A. et al.
Model predictive control monitoring using multivariate statistics
J. Process Control
(2009)

Ding S.X. et al.
Data-driven realizations of kernel and image representations and their application to fault detection and control system design
Automatica
(2014)

Hsu C. et al.
Intelligent ICA-SVM fault detector for non-Gaussian multivariate process monitoring
Expert Syst. Appl.
(2010)

Isermann R.
Model-based fault-detection and diagnosis - status and applications
Annual Reviews in Control
(2005)

Jenssen R. et al.
The Cauchy–Schwarz divergence and Parzen windowing: Connections to graph theory and Mercer kernels
J. Franklin Inst.
(2006)

Mahadevan S. et al.
Fault detection and diagnosis in process data using one-class support vector machines
J. Process Control
(2009)

Svärd C. et al.
Data-driven and adaptive statistical residual evaluation for fault detection with an automotive application
Mech. Syst. Signal Process.
(2014)

Wang G. et al.
Data-driven fault diagnosis for an automobile suspension system by using a clustering based method
J. Franklin Inst.
(2014)

Yu J. et al.
Statistical MIMO controller performance monitoring, Part I: Data-driven covariance benchmark
J. Process Control
(2008)

Chirico A.J. et al.
A data-driven methodology for fault detection in electromechanical actuators
Journal of Dynamic Systems, Measurement and Control
(2014)

Daewon L. et al.
Domain described support vector classifier for multi-classification problems
Pattern Recognit.
(2007)

Dai X.W. et al.
From model, signal to knowledge: A data-driven perspective of fault detection and diagnosis
IEEE Trans. on Industrial Informatics
(2013)

Dash P.K. et al.
Fault classification and section identification of an advanced series compensated transmission line using support vector machine
IEEE Trans. Power Deliv.
(2007)

El-Koujok M. et al.
Multiple sensor fault diagnosis by evolving data-driven approach
Information Science
(2014)
