On Complementarity of Cluster and Outlier Detection Schemes

Chen, Zhixiang; Fu, Ada Wai-Chee; Tang, Jian

doi:10.1007/978-3-540-45228-7_24

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2737)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

We are interested in the problem of outlier detection, that is, the discovery of data that deviate markedly from the patterns in the rest of the data. Hawkins [7] characterizes an outlier intuitively as follows: an outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism.

References

Ankerst, M., Breunig, M., Kriegel, H.P., Sander, J.: OPTICS: Ordering points to identify the cluster structure. In: Proc. of ACM-SIGMOD Conf., pp. 49–60 (1999)
Arning, A., Agrawal, R., Raghavan, P.: A Linear Method for Deviation detection in Large Databases. In: Proc. of 2nd Intl. Conf. On Knowledge Discovery and Data Mining, pp. 164–169 (1996)
Barnett, V., Lewis, T.: Outliers in Statistical Data. John Wiley, Chichester (1994)
Breunig, M., Kriegel, H.-P., Ng, R., Sander, J.: LOF: Identifying density-based Local Outliers. In: Proc. of the ACM SIGMOD Conf. (2000)
DuMouchel, W., Schonlau, M.: A Fast Computer Intrusion Detection Algorithm based on Hypothesis Testing of Command Transition Probabilities. In: Proc. of 4th Intl. Conf. On Knowledge Discovery and Data Mining, pp. 189–193 (1998)
Ester, M., Kriegel, H., Sander, J., Xu, X.: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In: Proc. of 2nd Intl. Conf. On Knowledge Discovery and Data Mining, pp. 226–231 (1996)
Fawcett, T., Provost, F.: Adaptive Fraud Detection. Data Mining and Knowledge Discovery Journal 1(3), 291–316 (1997)
Guha, S., Rastogi, R., Shim, K.: CURE: An Efficient Clustering Algorithm for Large Databases. In: Proc. of the ACM SIGMOD Conf., pp. 73–84 (1998)
Hawkins, D.: Identification of Outliers. Chapman and Hall, London (1980)
Knorr, E., Ng, R.: Algorithms for Mining Distance-based Outliers in Large Datasets. In: Proc. of 24th Intl. Conf. On VLDB, pp. 392–403 (1998)
Knorr, E., Ng, R.: Finding Intensional Knowledge of Distance-based Outliers. In: Proc. of 25th Intl. Conf. On VLDB, pp. 211–222 (1999)
Ng, R., Han, J.: Efficient and Effective Clustering Methods for Spatial Data Mining. In: Proc. of 20th Intl. Conf. On Very Large Data Bases, pp. 144–155 (1994)
Ramaswamy, S., Rastogi, R., Shim, K.: Efficient Algorithms for Mining Outliers from Large Data Sets. In: Proc. of ACM SIGMOD Conf., pp. 427–438 (2000)
Roussopoulos, N., Kelley, S., Vincent, F.: Nearest Neighbor Queries. In: Proc. of ACM SIGMOD Conf., pp. 71–79 (1995)
Sheikholeslami, G., Chatterjee, S., Zhang, A.: WaveCluster: A multi-Resolution Clustering Approach for Very Large Spatial Databases. In: Proc. of 24th Intl. Conf. On Very Large Data Bases, pp. 428–439 (1998)
Tang, J., Chen, Z., Fu, A., Cheung, D.: A Robust Outlier Detection Scheme in Large Data Sets. In: PAKDD (2002)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: An Efficient Data Clustering Method for Very Large Databases. In: Proc. of ACM SIGMOD Intl. Conf., pp. 103–114 (1996)

Author information

Authors and Affiliations

Department of Computer Science, University of Texas-Pan American, Edinburg, TX, 78539, USA
Zhixiang Chen
Department of Computer Science and Engineering, Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Ada Wai-Chee Fu & Jian Tang

Editor information

Editors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, Sakyo, 606-8501, Kyoto, Japan
Yahiko Kambayashi
I.B.M. India Research Lab, India
Mukesh Mohania
Institute for Application Oriented Knowledge Processing (FAW), Johannes Kepler University Linz, Austria
Wolfram Wöß

About this paper

Cite this paper

Chen, Z., Fu, A.WC., Tang, J. (2003). On Complementarity of Cluster and Outlier Detection Schemes. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2003. Lecture Notes in Computer Science, vol 2737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45228-7_24

DOI: https://doi.org/10.1007/978-3-540-45228-7_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40807-9
Online ISBN: 978-3-540-45228-7
eBook Packages: Springer Book Archive
