On Complementarity of Cluster and Outlier Detection Schemes

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2737))

Abstract

We are interested in the problem of outlier detection, that is, the discovery of data that deviate substantially from other data patterns. Hawkins [9] characterizes an outlier intuitively as follows: "An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism."

References

  1. Ankerst, M., Breunig, M., Kriegel, H.P., Sander, J.: OPTICS: Ordering points to identify the cluster structure. In: Proc. of ACM-SIGMOD Conf., pp. 49–60 (1999)

  2. Arning, A., Agrawal, R., Raghavan, P.: A Linear Method for Deviation Detection in Large Databases. In: Proc. of 2nd Intl. Conf. On Knowledge Discovery and Data Mining, pp. 164–169 (1996)

  3. Barnett, V., Lewis, T.: Outliers in Statistical Data. John Wiley, Chichester (1994)

  4. Breunig, M., Kriegel, H.-P., Ng, R., Sander, J.: LOF: Identifying Density-Based Local Outliers. In: Proc. of the ACM SIGMOD Conf. (2000)

  5. DuMouchel, W., Schonlau, M.: A Fast Computer Intrusion Detection Algorithm based on Hypothesis Testing of Command Transition Probabilities. In: Proc. of 4th Intl. Conf. On Knowledge Discovery and Data Mining, pp. 189–193 (1998)

  6. Ester, M., Kriegel, H., Sander, J., Xu, X.: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In: Proc. of 2nd Intl. Conf. On Knowledge Discovery and Data Mining, pp. 226–231 (1996)

  7. Fawcett, T., Provost, F.: Adaptive Fraud Detection. Data Mining and Knowledge Discovery Journal 1(3), 291–316 (1997)

  8. Guha, S., Rastogi, R., Shim, K.: CURE: An Efficient Clustering Algorithm for Large Databases. In: Proc. of the ACM SIGMOD Conf., pp. 73–84 (1998)

  9. Hawkins, D.: Identification of Outliers. Chapman and Hall, London (1980)

  10. Knorr, E., Ng, R.: Algorithms for Mining Distance-based Outliers in Large Datasets. In: Proc. of 24th Intl. Conf. On VLDB, pp. 392–403 (1998)

  11. Knorr, E., Ng, R.: Finding Intensional Knowledge of Distance-based Outliers. In: Proc. of 25th Intl. Conf. On VLDB, pp. 211–222 (1999)

  12. Ng, R., Han, J.: Efficient and Effective Clustering Methods for Spatial Data Mining. In: Proc. of 20th Intl. Conf. On Very Large Data Bases, pp. 144–155 (1994)

  13. Ramaswamy, S., Rastogi, R., Shim, K.: Efficient Algorithms for Mining Outliers from Large Data Sets. In: Proc. of ACM SIGMOD Conf., pp. 427–438 (2000)

  14. Roussopoulos, N., Kelley, S., Vincent, F.: Nearest Neighbor Queries. In: Proc. of ACM SIGMOD Conf., pp. 71–79 (1995)

  15. Sheikholeslami, G., Chatterjee, S., Zhang, A.: WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases. In: Proc. of 24th Intl. Conf. On Very Large Data Bases, pp. 428–439 (1998)

  16. Tang, J., Chen, Z., Fu, A., Cheung, D.: A Robust Outlier Detection Scheme in Large Data Sets. In: PAKDD (2002)

  17. Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: An Efficient Data Clustering Method for Very Large Databases. In: Proc. of ACM SIGMOD Intl. Conf., pp. 103–114 (1996)

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, Z., Fu, A.WC., Tang, J. (2003). On Complementarity of Cluster and Outlier Detection Schemes. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2003. Lecture Notes in Computer Science, vol 2737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45228-7_24

  • DOI: https://doi.org/10.1007/978-3-540-45228-7_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40807-9

  • Online ISBN: 978-3-540-45228-7

  • eBook Packages: Springer Book Archive