Abstract
An important issue in data analysis and pattern classification is the detection of anomalous observations and their influence on the classifier’s performance. In this paper, we introduce a novel methodology to systematically compare the performance of neural network (NN) methods applied to novelty detection problems. Initially, we describe the most common NN-based novelty detection techniques. Then we generalize to the supervised case a recently proposed unsupervised novelty detection method for computing reliable decision thresholds. We illustrate how to use the proposed methodology to evaluate the performance of supervised and unsupervised NN-based novelty detectors on a real-world benchmark data set, assessing their sensitivity to training parameters such as data scaling, number of neurons, training epochs and size of the training set.
Notes
By nonparametric we mean methods that make no, or very few, assumptions about the statistical distribution of the data.
The percentile of a distribution of values is a number N_α such that a percentage 100(1 − α) of the population values are less than or equal to N_α. For example, the 75th percentile (also referred to as the 0.75 quantile) is the value N_α, with α = 0.25, such that 75% of the values of the variable are less than or equal to it.
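As a minimal sketch of this definition, the snippet below computes N_α for an illustrative sample and checks the fraction of values at or below it. The function name `upper_percentile` and the sample data are assumptions made for illustration; NumPy's default linear interpolation between order statistics is used, which is one of several common percentile conventions.

```python
import numpy as np


def upper_percentile(values, alpha):
    """Return N_alpha, the 100*(1 - alpha)th percentile of `values`.

    Hypothetical helper: uses NumPy's default (linear interpolation) rule.
    """
    return float(np.percentile(values, 100.0 * (1.0 - alpha)))


# Illustrative sample (an assumption, not data from the paper): the integers 1..100.
values = np.arange(1, 101)

# alpha = 0.25 corresponds to the 75th percentile (the 0.75 quantile).
n_alpha = upper_percentile(values, alpha=0.25)

# Fraction of the sample that is less than or equal to N_alpha.
fraction_below = float(np.mean(values <= n_alpha))

print(n_alpha, fraction_below)
```

For small samples the fraction at or below N_α may only approximate 1 − α, depending on the interpolation rule chosen.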
In box plots, the ranges or distribution characteristics of the values of a selected variable (or variables) are plotted separately for groups of cases defined by the values of a categorical (grouping) variable. The central tendency (e.g., median or mean) and the range or variation statistics (e.g., quartiles, standard errors, or standard deviations) are computed for each group of cases, and the selected values are presented in the box plot. Outlier data points can also be plotted.
The Median Interneuron Distance (MID) matrix is defined as the matrix whose entry m_ij is the median of the Euclidean distances between the weight vector w_i and those of all neurons within its L-neighborhood.
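The following is a rough sketch of how such a matrix might be computed, under assumptions that go beyond the definition above: the SOM is taken to be a rectangular 2D grid with at least two neurons, the two matrix indices are read as grid coordinates, and the L-neighborhood is taken to be all other neurons within L grid steps along each axis. The function name `mid_matrix` and the random weights are hypothetical.

```python
import numpy as np


def mid_matrix(weights, L=1):
    """Median interneuron distance for each neuron of a 2D SOM grid.

    weights: array of shape (rows, cols, dim) holding the weight vectors.
    L: neighborhood radius in grid steps (square neighborhood, an assumption).
    """
    rows, cols, _ = weights.shape
    mid = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            dists = []
            # Visit every grid neighbor within L steps, clipped at the borders.
            for r in range(max(0, i - L), min(rows, i + L + 1)):
                for c in range(max(0, j - L), min(cols, j + L + 1)):
                    if (r, c) != (i, j):  # skip the neuron itself
                        dists.append(np.linalg.norm(weights[i, j] - weights[r, c]))
            mid[i, j] = np.median(dists)
    return mid


# Illustrative use with random weights standing in for a trained 4 x 5 SOM.
rng = np.random.default_rng(0)
demo_weights = rng.random((4, 5, 3))
demo_mid = mid_matrix(demo_weights, L=1)
print(demo_mid.shape)
```

Large entries indicate neurons whose weight vectors lie far from those of their grid neighbors.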
Sammon’s mapping is a nonlinear mapping that projects a set of input vectors onto a plane while trying to approximately preserve the relative distances between the input vectors. It is widely used to visualize the SOM ordering by mapping the weight vectors onto a plane. Sammon’s mapping can also be applied directly to data sets, but it is computationally very intensive.
We tested different SOM topologies and numbers of neurons. Confirming previous work on novelty detection (see reference [7]), the best results were obtained for 1D-SOMs, which have the additional advantage of being computationally lighter than 2D-SOMs.
By negative examples, we mean normal data vectors whose original label was changed from normal (+1) to abnormal (−1) in order to simulate the class of negative examples. This class can be understood as the one containing novel (or abnormal) examples, also called outliers.
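The relabeling described in this note can be sketched as follows. How many vectors are relabeled, and how they are selected, is not stated here, so the uniform random choice, the `fraction` parameter and the function name `simulate_negative_examples` are all assumptions made for illustration.

```python
import numpy as np


def simulate_negative_examples(n_samples, fraction, seed=0):
    """Relabel a random subset of normal vectors (+1) as abnormal (-1).

    Hypothetical helper: the subset is drawn uniformly at random without
    replacement, which is an assumption rather than the paper's procedure.
    Returns the label vector and the indices that were flipped.
    """
    rng = np.random.default_rng(seed)
    labels = np.ones(n_samples, dtype=int)       # every vector starts as normal
    n_flip = int(round(fraction * n_samples))    # number of simulated negatives
    flip_idx = rng.choice(n_samples, size=n_flip, replace=False)
    labels[flip_idx] = -1                        # simulated abnormal class
    return labels, flip_idx


labels, flipped = simulate_negative_examples(n_samples=20, fraction=0.25)
print(int((labels == -1).sum()), "of", labels.size, "vectors relabeled as abnormal")
```

Only the labels change; the data vectors themselves are left untouched.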
References
Addison JFD, Wermter S, MacIntyre J (1999) Effectiveness of feature extraction in neural network architectures for novelty detection. In: Proceedings of the 9th international conference on artificial neural networks (ICANN’99). IEEE Press, Washington, DC, pp 976–981
Albrecht S, Busch J, Kloppenburg M, Metze F, Tavan P (2000) Generalized radial basis functions networks for classification and novelty detection: self-organization of optimal Bayesian decision. Neural Netw 13:1075–1093
Alhoniemi E, Hollmén J, Simula O, Vesanto J (1999) Process monitoring and modeling using the self-organizing map. Integr Comput Aided Eng 6(1):3–14
Appiani E, Buslacchi G (2009) Computational intelligence solutions for homeland security. Adv Soft Comput 53:43–52
Augusteijn MF, Folkert BA (2002) Neural network classification and novelty detection. Int J Remote Sens 23(14):2891–2902
Barreto GA, Aguayo L (2009) Time series clustering for anomaly detection using competitive neural networks. In: Principe JC, Miikkulainen R (eds) Advances in self-organizing maps, vol LNCS-5629. Springer, Berlin, pp 28–36
Barreto GA, Mota JCM, Souza LGM, Frota RA, Aguayo L (2005) Condition monitoring of 3G cellular networks through competitive neural models. IEEE Trans Neural Netw 16(5):1064–1075
Blake CL, Merz CJ (1998) UCI repository of machine learning databases. University of California, Irvine, Dept. of Information and Computer Sciences. http://www.ics.uci.edu/~mlearn/MLRepository.html
Cristani M, Bicego M, Murino V (2007) Audio-visual event recognition in surveillance video sequences. IEEE Trans Multimed 9(2):257–266
Dawson MRW, Schopflocher DP (1992) Modifying the generalized delta rule to train networks of nonmonotonic processors for pattern classification. Connect Sci 4(1):19–31
DiCiccio TJ, Efron B (1996) Bootstrap confidence intervals. Stat Sci 11(3):189–228
Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. Chapman & Hall, Boca Raton
Fisch D, Hofmann A, Sick B (2010) On the versatility of radial basis function neural networks: a case study in the field of intrusion detection. Inform Sci 180(12):2421–2439
Flexer A (2001) On the use of self-organizing maps for clustering and visualization. Intell Data Anal 5(5):373–384
Frota RA, Barreto GA, Mota JCM (2007) Anomaly detection in mobile communication networks using the self-organizing map. J Intell Fuzzy Syst 18(5):493–500
Gonzalez F, Dasgupta D (2002) Neuro-immune and self-organizing map approaches to anomaly detection: a comparison. In: Proceedings of the first international conference on artificial immune systems, Canterbury, UK, pp 203–211
Harris T (1993) A Kohonen SOM based machine health monitoring system which enables diagnosis of faults not seen in the training set. In: Proceedings of the international joint conference on neural networks, (IJCNN’93), vol 1, pp 947–950
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507
Hodge VJ, Austin J (2004) A survey of outlier detection methodologies. Artif Intell Rev 22:85–126
Höglund AJ, Hätönen K, Sorvari AS (2000) A computer host-based user anomaly detection system using the self-organizing map. In: Proceedings of the IEEE-INNS-ENNS international joint conference on neural networks (IJCNN’00), vol 5, Como, Italy, pp 411–416
Japkowicz N, Myers C, Gluck M (1995) A novelty detection approach to classification. In: Proceedings of the 14th international joint conference on artificial intelligence (IJCAI’95), pp 518–523
Sammon JW Jr. (1969) A nonlinear mapping for data structure analysis. IEEE Trans Comput C-18:401–409
King S, Bannister PR, Clifton DA, Tarassenko L (2009) Probabilistic approach to the condition monitoring of aerospace engines. Proc IMechE Pt G J Aerosp Eng 223(5):533–541
Kohonen T (1989) Self-organization and associative memory, 3rd edn. Springer, Berlin
Kohonen T (1990) The self-organizing map. Proc IEEE 78(9):1464–1480
Kohonen T (2001) Self-organizing maps, 3rd edn. Springer, Berlin
Kohonen T, Oja E (1976) Fast adaptive formation of orthogonalizing filters and associative memory in recurrent networks of neuron-like elements. Biol Cybernet 25:85–95
Laiho J, Kylväjä M, Höglund A (2002) Utilisation of advanced analysis methods in UMTS networks. In: Proceedings of the IEEE vehicular technology conference (VTS/spring), Birmingham, Alabama, pp 726–730
Lawrence S, Burns I, Back AD, Tsoi AC, Giles CL (1998) Neural network classification and unequal prior class probabilities. In: Orr G, Müller K-R, Caruana R (eds) Neural networks: tricks of the trade, vol 1524. Lecture Notes in Computer Science, Springer, Berlin, pp 299–314
Lee H-J, Cho S, Cho M-S (2008) Supporting diagnosis of attention-deficit hyperactive disorder with novelty detection. Artif Intell Med 42(3):199–212
Li Y, Pont MJ, Jones NB (2002) Improving the performance of radial basis function classifiers in condition monitoring and fault diagnosis applications where ‘unknown’ faults may occur. Pattern Recogn Lett 23(5):569–577
Markou M, Singh S (2003) Novelty detection: a review—part 1: statistical approaches. Signal Proc 83(12):2481–2497
Markou M, Singh S (2003) Novelty detection: a review—part 2: neural network based approaches. Signal Proc 83(12):2499–2521
Markou M, Singh S (2006) A neural network-based novelty detector for image sequence analysis. IEEE Trans Pattern Anal Mach Intell 28(10):1664–1677
Marsland S (2003) Novelty detection in learning systems. Neural Comput Surv 3:157–195
Marsland S, Shapiro J, Nehmzow U (2002) A self-organising network that grows when required. Neural Netw 15(8–9):1041–1058
Modenesi AP, Braga AP (2009) Analysis of time series novelty detection strategies for synthetic and real data. Neural Proc Lett 30(1):1–17
Muñoz A, Muruzábal J (1998) Self-organising maps for outlier detection. Neurocomputing 18:33–60
Petsche T, Marcantonio A, Darken C, Hanson SJ, Kuhn GM, Santoso I (1996) A neural network autoassociator for induction motor failure prediction. In: Touretzky D, Mozer M, Hasselmo M (eds) Advances in neural information processing systems, vol 8. MIT Press, Cambridge, pp 924–930
Piciarelli C, Micheloni C, Foresti GL (2008) Trajectory-based anomalous event detection. IEEE Trans Circuit Syst Video Technol 18(11):1544–1554
Reich Y, Barai SV (1999) Evaluating machine learning models for engineering problems. Artif Intell Eng 13:257–272
Rose CJ, Taylor CJ (2004) A generative statistical model of mammographic appearance. In: Rueckert D, Hajnal J, Yang G-Z (eds) Proceedings of the 2004 medical image understanding and analysis (MUIA’04), pp 89–92
Schölkopf B, Williamson RC, Smola AJ, Shawe-Taylor J, Platt JC (2000) Support vector method for novelty detection. In: Solla SA, Leen TK, Müller K-R (eds) Advances in neural information processing systems, vol 12. MIT Press, Cambridge, pp 582–588
Tanaka M, Sakawa M, Shiromaru I, Matsumoto T (1995) Application of Kohonen’s self-organizing network to the diagnosis system for rotating machinery. In: Proceedings of the IEEE international conference on systems, man and cybernetics (SMC’95), vol 5, pp 4039–4044
Vasconcelos GC, Fairhurst MC, Bisset DL (1995) Investigating feedforward neural networks with respect to the rejection of spurious patterns. Pattern Recogn Lett 16:207–212
Vesanto J, Ahola J (1999) Hunting for correlations in data using the self-organizing map. In: Proceedings of the international ICSC congress on computational intelligence methods and applications (CIMA99), pp 279–285
Vieira Neto H, Nehmzow U (2007) Visual novelty detection with automatic scale selection. Robotics Autonomous Syst 55(9):693–701
Vu D, Vemuri VR (2002) Computer network intrusion detection: a comparison of neural networks methods. J Differ Equ Dyn Syst
Webb A (2002) Statistical pattern recognition, 2nd edn. Wiley, New York
Wolberg WH, Mangasarian OL (1990) Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proc Natl Acad Sci USA 87:9193–9196
Yamanishi K, Maruyama Y (2007) Dynamic model selection with its applications to novelty detection. IEEE Trans Inform Theory 53(6):2180–2189
Ypma A, Duin RPW (1997) Novelty detection using self-organising maps. In: Kasabov N, Kozma R, Ko K, O’Shea R, Goghill G, Gedeon T (eds) Progress in connectionist-based information systems, vol 2. Springer, Berlin, pp 1322–1325
Zhang Z, Li J, Manikopoulos CN, Jorgenson J, Ucles J (2001) HIDE: a hierarchical network intrusion detection system using statistical preprocessing and neural network classification. In: Proceedings of the IEEE workshop on information assurance and security, pp 85–90
Zhou J, Cheng L, Bischof WF (2007) Online learning with novelty detection in human-guided road tracking. IEEE Trans Geosci Remote Sens 45(12):3967–3977
Acknowledgements
The authors thank FUNCAP for supporting this research.
Cite this article
Barreto, G.A., Frota, R.A. A unifying methodology for the evaluation of neural network models on novelty detection tasks. Pattern Anal Applic 16, 83–97 (2013). https://doi.org/10.1007/s10044-011-0265-3
DOI: https://doi.org/10.1007/s10044-011-0265-3