Imbalanced dataset-based echo state networks for anomaly detection

  • Original Article
  • Neural Computing and Applications

Abstract

Anomaly detection is an effective way to extract useful information from abundant data. Most existing anomaly detection methods are based on a normal region or on specific algorithms and ignore the fact that many real datasets are imbalanced, so they do not function properly or effectively in practice, especially in the medical field. Imbalanced datasets are also a frequently encountered problem in neural network learning, because the lack of data in a minority class may lead to uneven classification accuracy. Inspired by these observations, this paper presents a novel anomaly detection approach based on the classical echo state network (ESN), a brain-inspired neural computing model. The entire dataset considered by the proposed method follows an extremely imbalanced distribution, that is, anomalies are much rarer than normal data, and the training dataset contains only normal data. Once the ESN is well trained, its parameters serve as a memory of the normal data. When normal data are fed into the well-trained network, the error between the input and the corresponding output is smaller than the error obtained for abnormal input. Anomalous behavior is therefore detected when the error between the input data and the corresponding predicted value exceeds a certain threshold. Rather than being set arbitrarily to a fixed value for all data, the threshold used in the proposed method is determined from an information-theoretic analysis and can be adjusted adaptively for different datasets. Experiments on abnormal heart rate detection are conducted to demonstrate and verify the effectiveness of the proposed detection algorithm and theory.
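The detection scheme described in the abstract (train on normal data only, then flag inputs whose prediction error exceeds a threshold) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-step-ahead prediction task, the reservoir size and spectral radius, the ridge-regression readout, the synthetic sine wave standing in for heart-rate data, and in particular the threshold rule (1.5 times the largest training error), which is only a placeholder for the information-theoretic threshold that the abstract mentions but does not specify.

```python
import numpy as np

def make_reservoir(n_res=50, spectral_radius=0.9, seed=0):
    """Random, fixed input and recurrent weights for a scalar-input ESN."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=n_res)
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def reservoir_states(w_in, w, u):
    """Drive the reservoir with sequence u, starting from a zero state."""
    x = np.zeros(w.shape[0])
    states = np.empty((len(u), w.shape[0]))
    for t, u_t in enumerate(u):
        x = np.tanh(w_in * u_t + w @ x)
        states[t] = x
    return states

def prediction_errors(w_in, w, w_out, u, washout):
    """Absolute one-step-ahead errors; errors[i] belongs to u[washout + 1 + i]."""
    states = reservoir_states(w_in, w, u[:-1])[washout:]
    return np.abs(states @ w_out - u[washout + 1:])

washout = 100  # initial steps discarded while the zero start state fades
ridge = 1e-6   # regularisation strength for the linear readout (assumed)

# Training set: normal data only (a sine wave as a stand-in signal).
normal = np.sin(0.2 * np.arange(1000))
w_in, w = make_reservoir()
train_states = reservoir_states(w_in, w, normal[:-1])[washout:]
targets = normal[washout + 1:]
gram = train_states.T @ train_states + ridge * np.eye(train_states.shape[1])
w_out = np.linalg.solve(gram, train_states.T @ targets)

# Placeholder threshold derived from the errors on normal training data.
# NOT the paper's information-theoretic rule.
train_err = prediction_errors(w_in, w, w_out, normal, washout)
threshold = 1.5 * train_err.max()

# Test set: a continuation of the normal signal with one injected anomaly.
test = np.sin(0.2 * np.arange(1000, 1400))
spike = 250
test[spike] += 1.0

test_err = prediction_errors(w_in, w, w_out, test, washout)
flags = test_err > threshold
flagged_indices = np.flatnonzero(flags) + washout + 1  # positions in `test`
print("threshold:", threshold)
print("flagged positions:", flagged_indices)
```

Because the reservoir state carries memory of the anomalous input, positions shortly after the injected spike may be flagged as well. A practical detector would also need a principled threshold, which is the part the paper addresses.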

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (No. 61773081) and the Technology Transformation Program of Chongqing Higher Education University (KJZH17102).

Author information

Corresponding author

Correspondence to Yongduan Song.

About this article

Cite this article

Chen, Q., Zhang, A., Huang, T. et al. Imbalanced dataset-based echo state networks for anomaly detection. Neural Comput & Applic 32, 3685–3694 (2020). https://doi.org/10.1007/s00521-018-3747-z

  • DOI: https://doi.org/10.1007/s00521-018-3747-z
