Abstract
We propose a simple and efficient method for detecting exceptional data that includes a novel explanation facility for the end user. After trying several designs, we found that the best one was based on an unsupervised learning scheme that uses an adaptation of the ART artificial neural network paradigm for the clustering task. In our method, the cluster that contains the smallest number of instances is considered outlier data. The method then explains to the end user why this cluster is exceptional with regard to the data universe. The proposed method has been tested and compared successfully, not only on well-known academic data sets but also on a real, very large financial database whose attributes take numerical and categorical values.
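The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify their ART adaptation, so the code below substitutes a simple ART-like "leader" clustering in which a point joins the nearest prototype only if it lies within a vigilance radius. The function names, the `vigilance` value, the attribute names, and the toy data are all hypothetical, and only numerical attributes are handled. The explanation step simply ranks attributes by how far the smallest cluster's mean lies from the overall mean.

```python
from collections import Counter
from math import dist


def leader_cluster(points, vigilance):
    """ART-like leader clustering (an assumed stand-in, not the paper's network).

    A point joins its nearest prototype if the distance is within
    `vigilance`; otherwise it founds a new cluster.
    """
    prototypes, counts, labels = [], [], []
    for p in points:
        best, best_d = None, None
        for k, proto in enumerate(prototypes):
            d = dist(p, proto)
            if best_d is None or d < best_d:
                best, best_d = k, d
        if best is not None and best_d <= vigilance:
            # Move the winning prototype to the running mean of its members.
            counts[best] += 1
            n = counts[best]
            prototypes[best] = [m + (x - m) / n
                                for m, x in zip(prototypes[best], p)]
            labels.append(best)
        else:
            prototypes.append(list(p))
            counts.append(1)
            labels.append(len(prototypes) - 1)
    return labels


def smallest_cluster(labels):
    """Return the label of the cluster with the fewest instances."""
    sizes = Counter(labels)
    return min(sizes, key=lambda k: (sizes[k], k))


def explain(points, labels, outlier_label, names):
    """Rank attributes by |cluster mean - overall mean|, largest first."""
    members = [p for p, l in zip(points, labels) if l == outlier_label]
    report = []
    for j, name in enumerate(names):
        overall = sum(p[j] for p in points) / len(points)
        local = sum(p[j] for p in members) / len(members)
        report.append((name, local - overall))
    return sorted(report, key=lambda r: abs(r[1]), reverse=True)


# Hypothetical toy data: two dense groups and one isolated record.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0),
          (5.0, 5.2), (5.1, 4.9), (4.9, 5.0), (9.0, 1.0)]
labels = leader_cluster(points, vigilance=1.5)
outlier = smallest_cluster(labels)
for name, delta in explain(points, labels, outlier, ["amount", "rate"]):
    print(f"{name}: cluster mean differs from overall mean by {delta:+.2f}")
```

In this sketch the result depends on the presentation order of the points and on the chosen vigilance radius, which is typical of leader-style clustering; a real application would need to tune both and to encode categorical attributes before computing distances.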
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mejía-Lavalle, M., Sánchez Vivar, A. (2009). Outlier Detection with Explanation Facility. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science, vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_34
DOI: https://doi.org/10.1007/978-3-642-03070-3_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03069-7
Online ISBN: 978-3-642-03070-3
eBook Packages: Computer Science (R0)