
A unifying methodology for the evaluation of neural network models on novelty detection tasks

  • Theoretical Advances

Abstract

An important issue in data analysis and pattern classification is the detection of anomalous observations and the assessment of their influence on the classifier’s performance. In this paper, we introduce a novel methodology for systematically comparing the performance of neural network (NN) methods applied to novelty detection problems. We first describe the most common NN-based novelty detection techniques. We then generalize to the supervised case a recently proposed unsupervised novelty detection method for computing reliable decision thresholds. Finally, we illustrate how to use the proposed methodology to evaluate the performance of supervised and unsupervised NN-based novelty detectors on a real-world benchmark data set, assessing their sensitivity to training parameters such as data scaling, number of neurons, number of training epochs and size of the training set.


Notes

  1. By nonparametric we mean methods that make few or no assumptions about the statistical distribution of the data.

  2. The percentile of a distribution of values is a number N_α such that a percentage 100(1 − α) of the population values are less than or equal to N_α. For example, the 75th percentile (also referred to as the 0.75 quantile) is the value N_α below which 75% of the values of the variable fall. A minimal thresholding sketch based on percentiles is given after these notes.

  3. In box plots, the ranges or distribution characteristics of the values of a selected variable (or variables) are plotted separately for groups of cases defined by the values of a categorical (grouping) variable. A measure of central tendency (e.g., the median or mean) and measures of range or variation (e.g., quartiles, standard errors, or standard deviations) are computed for each group of cases and displayed in the box plot. Outlier data points can also be plotted. A small plotting sketch is given after these notes.

  4. The Median Interneuron Distance (MID) matrix is defined as the matrix whose entry m_ij is the median of the Euclidean distances between the weight vector of the neuron at grid position (i, j) and the weight vectors of all neurons within its L-neighborhood. A computational sketch is given after these notes.

  5. Sammon’s mapping is a nonlinear mapping that projects a set of input vectors onto a plane while trying to approximately preserve the pairwise distances between them. It is widely used to visualize the SOM ordering by mapping the weight vectors onto a plane. Sammon’s mapping can also be applied directly to data sets, but it is computationally very intensive. A minimal optimization sketch is given after these notes.

  6. We have tested different SOM topologies and numbers of neurons. Confirming a previous work on novelty detection (see reference [7]), the best results were obtained with 1D-SOMs, which have the additional advantage of being computationally lighter than 2D-SOMs. A minimal 1D-SOM training sketch is given after these notes.

  7. By negative examples we mean normal data vectors whose original label was changed from normal (+1) to abnormal (−1) in order to simulate the class of negative examples. This class can be understood as the one containing novel (or abnormal) examples, also called outliers.
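
Illustrating note 2, the following is a minimal NumPy sketch of percentile-based thresholding; the Rayleigh-distributed scores are placeholder data standing in for whatever novelty score a detector produces, and the 95th percentile is an arbitrary illustrative choice.

```python
import numpy as np

# Hypothetical novelty scores (e.g., quantization or reconstruction errors)
# computed on data known to be normal.
rng = np.random.default_rng(0)
errors = rng.rayleigh(scale=1.0, size=1000)

# 95th percentile (the 0.95 quantile): about 95% of the normal scores
# fall at or below this value, so it can serve as a decision threshold.
threshold = np.percentile(errors, 95)

# A new observation is flagged as novel if its score exceeds the threshold.
new_scores = np.array([0.7, 3.5])
is_novel = new_scores > threshold
print(threshold, is_novel)
```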
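
Illustrating note 3, a small matplotlib sketch of grouped box plots; the group names and the normally distributed scores are purely illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical novelty scores grouped by a categorical variable
# (here, the size of the training set).
groups = {"small": rng.normal(1.0, 0.30, 200),
          "medium": rng.normal(0.8, 0.20, 200),
          "large": rng.normal(0.6, 0.15, 200)}

fig, ax = plt.subplots()
# One box per group: median, quartiles, whiskers and outlier points.
ax.boxplot(list(groups.values()), showfliers=True)
ax.set_xticks(range(1, len(groups) + 1))
ax.set_xticklabels(groups.keys())
ax.set_ylabel("novelty score")
plt.show()
```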
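
Illustrating note 4, a minimal sketch of the MID computation, assuming a 2-D SOM lattice whose weight vectors are stored row by row; the function name mid_matrix and the rectangular L-neighborhood are choices of this sketch, not prescriptions of the paper.

```python
import numpy as np

def mid_matrix(weights, grid_shape, L=1):
    """Median interneuron distance for each neuron of a SOM lattice.

    weights    : array of shape (n_rows * n_cols, dim), one weight vector
                 per neuron, stored row by row on the grid.
    grid_shape : (n_rows, n_cols) of the lattice.
    L          : neighborhood radius on the grid.
    """
    n_rows, n_cols = grid_shape
    w = weights.reshape(n_rows, n_cols, -1)
    mid = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            dists = []
            for di in range(-L, L + 1):
                for dj in range(-L, L + 1):
                    if di == 0 and dj == 0:
                        continue  # skip the neuron itself
                    ni, nj = i + di, j + dj
                    if 0 <= ni < n_rows and 0 <= nj < n_cols:
                        dists.append(np.linalg.norm(w[i, j] - w[ni, nj]))
            mid[i, j] = np.median(dists)
    return mid
```

For the 1D-SOMs of note 6, the lattice degenerates to a single row and the MID matrix to a vector.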
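
Illustrating note 5, a minimal sketch of Sammon’s mapping that minimizes Sammon’s stress with a general-purpose optimizer rather than the iterative gradient scheme of the original paper [22].

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import pdist

def sammon_stress(flat_y, d_star, n_points):
    # Reshape the flat parameter vector into 2-D output coordinates.
    y = flat_y.reshape(n_points, 2)
    d = pdist(y)                           # pairwise distances in the plane
    return np.sum((d_star - d) ** 2 / (d_star + 1e-12)) / np.sum(d_star)

def sammon_mapping(x, n_iter=500, seed=0):
    """Project the rows of x onto a plane, approximately preserving distances."""
    rng = np.random.default_rng(seed)
    d_star = pdist(x)                      # pairwise distances in the input space
    y0 = rng.normal(scale=1e-2, size=(x.shape[0], 2))
    res = minimize(sammon_stress, y0.ravel(), args=(d_star, x.shape[0]),
                   method="L-BFGS-B", options={"maxiter": n_iter})
    return res.x.reshape(x.shape[0], 2)
```

Applied to the weight vectors of a trained SOM, the resulting 2-D coordinates can be plotted to inspect the map ordering.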
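
Illustrating note 6, a minimal sketch of a 1D-SOM trained with the classic online rule, together with the quantization errors that could feed the percentile thresholding of note 2; the learning-rate and neighborhood schedules are ordinary textbook choices, not necessarily those used in the experiments.

```python
import numpy as np

def train_som_1d(data, n_neurons=20, n_epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM (a chain of neurons) with the classic online rule."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(data.min(axis=0), data.max(axis=0),
                    size=(n_neurons, data.shape[1]))
    idx = np.arange(n_neurons)
    sigma0 = n_neurons / 2.0
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = lr0 * (1.0 - frac) + 1e-3          # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighborhood width
        for x in rng.permutation(data):
            bmu = np.argmin(np.linalg.norm(w - x, axis=1))        # best-matching unit
            h = np.exp(-(idx - bmu) ** 2 / (2.0 * sigma ** 2))    # 1-D neighborhood
            w += lr * h[:, None] * (x - w)
    return w

def quantization_errors(data, w):
    """Distance from each sample to its closest weight vector (novelty score)."""
    return np.min(np.linalg.norm(data[:, None, :] - w[None, :, :], axis=2), axis=1)
```

Errors computed on held-out normal data would set the percentile threshold, and test samples whose error exceeds it would be flagged as novel.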

References

  1. Addison JFD, Wermter S, MacIntyre J (1999) Effectiveness of feature extraction in neural network architectures for novelty detection. In: Proceedings of the 9th international conference on artificial neural networks (ICANN’99). IEEE Press, Washington, DC, pp 976–981

  2. Albrecht S, Busch J, Kloppenburg M, Metze F, Tavan P (2000) Generalized radial basis functions networks for classification and novelty detection: self-organization of optimal bayesian decision. Neural Netw 13:1075–1093


  3. Alhoniemi E, Hollmén J, Simula O, Vesanto J (1999) Process monitoring and modeling using the self-organizing map. Integr Comput Aided Eng 6(1):3–14


  4. Appiani E, Buslacchi G (2009) Computational intelligence solutions for homeland security. Adv Soft Comput 53:43–52


  5. Augusteijn MF, Folkert BA (2002) Neural network classification and novelty detection. Int J Remote Sens 23(14):2891–2902


  6. Barreto GA, Aguayo L (2009) Time series clustering for anomaly detection using competitive neural networks. In: Principe JC, Miikkulainen R (eds) Advances in self-organizing maps, vol LNCS-5629. Springer, Berlin, pp 28–36

  7. Barreto GA, Mota JCM, Souza LGM, Frota RA, Aguayo L (2005) Condition monitoring of 3G cellular networks through competitive neural models. IEEE Trans Neural Netw 16(5):1064–1075


  8. Blake CL, Merz CJ (1998) UCI repository of machine learning databases. University of California, Irvine, Dept. of Information and Computer Sciences. http://www.ics.uci.edu/~mlearn/MLRepository.html

  9. Cristani M, Bicego M, Murino V (2007) Audio-visual event recognition in surveillance video sequences. IEEE Trans Multimed 9(2):257–266


  10. Dawson MRW, Schopflocher DP (1992) Modifying the generalized delta rule to train networks of nonmonotonic processors for pattern classification. Connect Sci 4(1):19–31


  11. DiCiccio TJ, Efron B (1996) Bootstrap confidence intervals. Stat Sci 11(3):189–228


  12. Efron B, Tibshirani RJ (1993) An Introduction to the Bootstrap. Chapman & Hall, Boca Raton

  13. Fisch D, Hofmann A, Sick B (2010) On the versatility of radial basis function neural networks: a case study in the field of intrusion detection. Inform Sci 180(12):2421–2439


  14. Flexer A (2001) On the use of self-organizing maps for clustering and visualization. Intell Data Anal 5(5):373–384


  15. Frota RA, Barreto GA, Mota JCM (2007) Anomaly detection in mobile communication networks using the self-organizing map. J Intell Fuzzy Syst 18(5):493–500


  16. Gonzalez F, Dasgupta D (2002) Neuro-immune and self-organizing map approaches to anomaly detection: a comparison. In: Proceedings of the first international conference on artificial immune systems, Canterbury, UK, pp 203–211

  17. Harris T (1993) A Kohonen SOM based machine health monitoring system which enables diagnosis of faults not seen in the training set. In: Proceedings of the international joint conference on neural networks, (IJCNN’93), vol 1, pp 947–950

  18. Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507


  19. Hodge VJ, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22:85–126


  20. Höglund AJ, Hätönen K, Sorvari AS (2000) A computer host-based user anomaly detection system using the self-organizing map. In: Proceedings of the IEEE-INNS-ENNS international joint conference on neural networks (IJCNN’00), vol 5, Como, Italy, pp 411–416

  21. Japkowicz N, Myers C, Gluck M (1995) A novelty detection approach to classification. In: Proceedings of the 14th international joint conference on artificial intelligence (IJCAI’95), pp 518–523

  22. Sammon JW Jr. (1969) A nonlinear mapping for data structure analysis. IEEE Trans Comput C-18:401–409


  23. King S, Bannister PR, Clifton DA, Tarassenko L (2009) Probabilistic approach to the condition monitoring of aerospace engines. Proc IMechE Pt G J Aerosp Eng 223(5):533–541


  24. Kohonen T (1989) Self-organization and associative memory, 3rd edn. Springer, Berlin

  25. Kohonen T (1990) The self-organizing map. Proc IEEE 78(9):1464–1480


  26. Kohonen T (2001) Self-organizing maps, 3rd edn. Springer, Berlin

  27. Kohonen T, Oja E (1976) Fast adaptive formation of orthogonalizing filters and associative memory in recurrent networks of neuron-like elements. Biol Cybernet 25:85–95


  28. Laiho J, Kylväjä M, Höglund A (2002) Utilisation of advanced analysis methods in UMTS networks. In: Proceedings of the IEEE vehicular technology conference (VTS/spring), Birmingham, Alabama, pp 726–730

  29. Lawrence S, Burns I, Back AD, Tsoi AC, Giles CL (1998) Neural network classification and unequal prior class probabilities. In: Orr G, Müller K-R, Caruana R (eds) Neural networks: tricks of the trade, vol 1524. Lecture Notes in Computer Science, Springer, Berlin, pp 299–314

  30. Lee H-J, Cho S, Cho M-S (2008) Supporting diagnosis of attention-deficit hyperactive disorder with novelty detection. Artif Intell Med 42(3):199–212


  31. Li Y, Pont MJ, Jones NB (2002) Improving the performance of radial basis function classifiers in condition monitoring and fault diagnosis applications where ‘unknown’ faults may occur. Pattern Recogn Lett 23(5):569–577


  32. Markou M, Singh S (2003) Novelty detection: a review—part 1: statistical approaches. Signal Proc 83(12):2481–2497


  33. Markou M, Singh S (2003) Novelty detection: a review—part 2: neural network based approaches. Signal Proc 83(12):2499–2521


  34. Markou M, Singh S (2006) A neural network-based novelty detector for image sequence analysis. IEEE Trans Pattern Anal Mach Intell 28(10):1664–1677


  35. Marsland S (2003) Novelty detection in learning systems. Neural Comput Surv 3:157–195


  36. Marsland S, Shapiro J, Nehmzow U (2002) A self-organising network that grows when required. Neural Netw 15(8–9):1041–1058


  37. Modenesi AP, Braga AP (2009) Analysis of time series novelty detection strategies for synthetic and real data. Neural Proc Lett 30(1):1–17


  38. Muñoz A, Muruzábal J (1998) Self-organising maps for outlier detection. Neurocomputing 18:33–60


  39. Petsche T, Marcantonio A, Darken C, Hanson SJ, Kuhn GM, Santoso I (1996) A neural network autoassociator for induction motor failure prediction. In: Touretzky D, Mozer M, Hasselmo M (eds) Advances in neural information processing systems, vol. 8. MIT Press, Cambridge, pp 924–930

  40. Piciarelli C, Micheloni C, Foresti GL (2008) Trajectory-based anomalous event detection. IEEE Trans Circuit Syst Video Technol 18(11):1544–1554


  41. Reich Y, Barai SV (1999) Evaluating machine learning models for engineering problems. Artif Intell Eng 13:257–272


  42. Rose CJ, Taylor CJ (2004) A generative statistical model of mammographic appearance. In: Rueckert D, Hajnal J, Yang G-Z (eds) Proceedings of the 2004 medical image understanding and analysis (MUIA’04), pp 89–92

  43. Scholkopf B, Williamson RC, Smola AJ, Shawe-Taylor J, Platt JC (2000) Support vector method for novelty detection. In: Solla SA, Leen TK, Müller K-R (eds) Advances in neural information processing systems, vol 12. MIT Press, Cambridge, pp 582–588

  44. Tanaka M, Sakawa M, Shiromaru I, Matsumoto T (1995) Application of Kohonen’s self-organizing network to the diagnosis system for rotating machinery. In: Proceedings of the IEEE international conference on systems, man and cybernetics (SMC’95), vol 5, pp 4039–4044

  45. Vasconcelos GC, Fairhurst MC, Bisset DL (1995) Investigating feedforward neural networks with respect to the rejection of spurious patterns. Pattern Recogn Lett 16:207–212


  46. Vesanto J, Ahola J (1999) Hunting for correlations in data using the self-organizing map. In: Proceedings of the international ICSC congress on computational intelligence methods and applications (CIMA99), pp 279–285

  47. Vieira Neto H, Nehmzow U (2007) Visual novelty detection with automatic scale selection. Robotics Autonomous Syst 55(9):693–701


  48. Vu D, Vemuri VR (2002) Computer network intrusion detection: A comparison of neural networks methods. J Differ Equ Dyn Syst

  49. Webb A (2002) Statistical pattern recognition, 2nd edn. Wiley, New York

  50. Wolberg WH, Mangasarian OL (1990) Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proc Natl Acad Sci USA 87:9193–9196


  51. Yamanishi K, Maruyama Y (2007) Dynamic model selection with its applications to novelty detection. IEEE Trans Inform Theory 53(6):2180–2189


  52. Ypma A, Duin RPW (1997) Novelty detection using self-organising maps. In: Kasabov N, Kozma R, Ko K, O’Shea R, Goghill G, Gedeon T (eds) Progress in connectionist-based information systems, vol 2. Springer, Berlin, pp 1322–1325

  53. Zhang Z, Li J, Manikopoulos CN, Jorgenson J, Ucles J (2001) HIDE: a hierarchical network intrusion detection system using statistical preprocessing and neural network classification. In: Proceedings of the IEEE workshop on information assurance and security, pp 85–90

  54. Zhou J, Cheng L, Bischof WF (2007) Online learning with novelty detection in human-guided road tracking. IEEE Trans Geosci Remote Sens 45(12):3967–3977



Acknowledgements

The authors thank FUNCAP for supporting this research.

Author information

Corresponding author

Correspondence to Guilherme A. Barreto.


About this article

Cite this article

Barreto, G.A., Frota, R.A. A unifying methodology for the evaluation of neural network models on novelty detection tasks. Pattern Anal Applic 16, 83–97 (2013). https://doi.org/10.1007/s10044-011-0265-3
