Abstract
Existing studies of outlier detection mostly focus on detecting outliers in the full feature space. However, most algorithms tend to break down in high-dimensional feature spaces because classes of objects often exist in specific subspaces of the original feature space. Subspace outlier detection has therefore been defined recently. As a novel solution to this problem, we propose a local subspace based outlier detection technique that uses different subspaces for different objects. Using this concept, we adopt local density based outlier detection to cope with high-dimensional data. A broad experimental evaluation shows that this approach yields results of significantly better quality than existing algorithms.
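The idea of scoring each object by local density inside its own subspace can be illustrated with a small sketch. The abstract does not say how the paper selects subspaces or computes scores, so everything below is an assumption made for illustration: the helper names, the low-variance heuristic for picking dimensions, and the simplified density ratio (a reduced form of LOF without reachability distances) are hypothetical and are not the author's algorithm.

```python
import math

def dist(a, b, dims):
    """Euclidean distance between a and b restricted to the dimensions in dims."""
    return math.sqrt(sum((a[d] - b[d]) ** 2 for d in dims))

def knn(data, i, k, dims):
    """Indices of the k nearest neighbours of point i, measured only in dims."""
    others = [j for j in range(len(data)) if j != i]
    others.sort(key=lambda j: dist(data[i], data[j], dims))
    return others[:k]

def local_subspace(data, i, k, n_dims):
    """Assumed heuristic: keep the n_dims dimensions in which the
    full-space neighbours of point i vary the least."""
    all_dims = list(range(len(data[i])))
    neigh = knn(data, i, k, all_dims)

    def variance(d):
        vals = [data[j][d] for j in neigh]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)

    return sorted(all_dims, key=variance)[:n_dims]

def density(data, i, k, dims):
    """Inverse of the mean distance from point i to its k neighbours in dims."""
    neigh = knn(data, i, k, dims)
    mean_dist = sum(dist(data[i], data[j], dims) for j in neigh) / len(neigh)
    return 1.0 / (mean_dist + 1e-12)  # small constant guards against duplicates

def local_subspace_scores(data, k=5, n_dims=2):
    """Simplified density-ratio score per point, each in its own subspace.
    Values well above 1 mean the point is sparser than its neighbours."""
    scores = []
    for i in range(len(data)):
        dims = local_subspace(data, i, k, n_dims)
        neigh = knn(data, i, k, dims)
        neigh_density = sum(density(data, j, k, dims) for j in neigh) / len(neigh)
        scores.append(neigh_density / density(data, i, k, dims))
    return scores
```

Calling `local_subspace_scores(points)` on a list of equal-length numeric tuples returns one score per point. Because each point is compared with its neighbours only in the dimensions chosen for that point, a widely spread, irrelevant attribute does not mask a deviation in the attributes where the surrounding objects are tightly clustered.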
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Agrawal, A. (2009). Local Subspace Based Outlier Detection. In: Ranka, S., et al. Contemporary Computing. IC3 2009. Communications in Computer and Information Science, vol 40. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03547-0_15
DOI: https://doi.org/10.1007/978-3-642-03547-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03546-3
Online ISBN: 978-3-642-03547-0
eBook Packages: Computer Science, Computer Science (R0)