Abstract
Outlier detection is of increasing practical value in application areas such as intrusion detection, fraud detection, and the discovery of criminal activity in electronic commerce. Many techniques have been developed for it, including distribution-based, depth-based, distance-based, density-based, and clustering-based outlier detection algorithms. Spectral clustering has received much attention in recent years as a competitive clustering algorithm. However, it does not scale well to modern large datasets. To partially circumvent this drawback, we propose a new outlier detection method inspired by spectral clustering. Our algorithm combines the concept of k-nearest neighbors (kNN) with spectral clustering techniques and identifies abnormal data as outliers through statistical use of eigenvalue and eigenvector information in the feature space. We compare the performance of our method with distance-based and density-based outlier detection methods. Experimental results show the effectiveness of our algorithm for identifying outliers.
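For orientation, the distance-based baseline mentioned in the abstract is commonly illustrated by scoring each point by the distance to its k-th nearest neighbor. The sketch below shows that generic idea only; it is not the spectral method proposed in the paper, whose details the abstract does not give. The function name `knn_distance_scores`, the toy data, and the choice `k=2` are assumptions made for illustration.

```python
import math


def knn_distance_scores(points, k=2):
    """Return, for each point, the Euclidean distance to its k-th nearest neighbor.

    Larger scores mean the point lies farther from its neighbors.
    (Illustrative helper, not taken from the paper.)
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # Index k - 1 holds the distance to the k-th nearest neighbor.
        scores.append(dists[k - 1])
    return scores


# Toy data (made up): a tight cluster plus one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_distance_scores(points, k=2)
most_outlying = max(range(len(points)), key=lambda i: scores[i])
```

Here `most_outlying` is the index of the point with the largest score, which in this toy set is the distant point. Ranking by this score and reporting the top few points is one simple way such a baseline is used.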
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, Y., Wang, X., Wang, X.L. (2016). A Spectral Clustering Based Outlier Detection Technique. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_2
DOI: https://doi.org/10.1007/978-3-319-41920-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science (R0)