Abstract
In this paper, we propose a novel outlier mining algorithm, called Grid-ODF, that takes into account both the local and global perspectives of outliers for effective detection. The notion of Outlying Degree Factor (ODF), which reflects both density and distance, is introduced to rank outliers. A grid structure that partitions the data space is employed so that Grid-ODF can be implemented efficiently. Experimental results show that Grid-ODF outperforms existing outlier detection algorithms such as LOF and KNN-distance in terms of both effectiveness and efficiency.
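The abstract does not give the ODF formula, so the following is only a minimal sketch of the general idea of combining a grid partition with a density factor and a distance factor. The function name `grid_outlier_scores`, the parameters `cell_width` and `dense_threshold`, and the score itself (distance to the nearest dense cell divided by the point's own cell count) are assumptions made for illustration and should not be read as the paper's definition.

```python
import math
from collections import defaultdict


def grid_outlier_scores(points, cell_width, dense_threshold):
    """Score points with an illustrative density-and-distance factor.

    NOTE: the scoring formula is an assumption for illustration only;
    it is not the ODF definition from the paper.
    """
    # Assign every point to a grid cell by flooring its scaled coordinates.
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(x / cell_width) for x in p)
        cells[key].append(p)

    # Centroids of dense cells stand in for the crowded regions of the data.
    dense_centroids = [
        tuple(sum(coord) / len(members) for coord in zip(*members))
        for members in cells.values()
        if len(members) >= dense_threshold
    ]
    if not dense_centroids:
        raise ValueError("no dense cells; lower dense_threshold or widen cells")

    scores = []
    for p in points:
        key = tuple(math.floor(x / cell_width) for x in p)
        density = len(cells[key])  # density factor: points sharing the cell
        nearest = min(math.dist(p, c) for c in dense_centroids)  # distance factor
        scores.append(nearest / density)  # higher score = more outlying
    return scores


# Hypothetical toy data: four clustered points and one isolated point.
points = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.1), (0.2, 0.3), (5.0, 5.0)]
scores = grid_outlier_scores(points, cell_width=1.0, dense_threshold=3)
ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
```

Because the cell counts are gathered in a single pass and each point is then compared only against dense-cell centroids rather than against every other point, a sketch like this hints at why a grid can make ranking cheaper than pairwise neighbor searches.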
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, W., Zhang, J., Wang, H. (2005). Grid-ODF: Detecting Outliers Effectively and Efficiently in Large Multi-dimensional Databases. In: Hao, Y., et al. Computational Intelligence and Security. CIS 2005. Lecture Notes in Computer Science, vol 3801. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596448_113
DOI: https://doi.org/10.1007/11596448_113
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30818-8
Online ISBN: 978-3-540-31599-5
eBook Packages: Computer Science, Computer Science (R0)