ODDC: Outlier Detection Using Distance Distribution Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4819)

Abstract

Outlier detection is an important issue in many industrial and financial applications. Most outlier detection methods suffer from two problems: first, they need parameter tuning according to domain knowledge; second, they do not scale to high-dimensional spaces. In this paper, we propose a distance-based outlier definition and a detection algorithm, ODDC (Distribution Clustering Outlier Detection). We redefine the problem by clustering in the distribution difference space rather than in the original feature space. As a result, the new algorithm is stable across different inputs and scales with dimensionality. Experiments on both synthetic and real datasets show that ODDC outperforms its counterpart in both effectiveness and efficiency.
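The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of clustering in a distance-distribution difference space. It is not the authors' ODDC procedure. Every name and design choice here is an assumption made for illustration: the histogram binning, the L1 gap to the average histogram as the "difference", and the 1-D 2-means split.

```python
import math


def distance_histograms(points, bins=10):
    """Return, per point, a normalized histogram of its distances to all other points.

    Assumption: all points share the same bin edges, spanning 0..max pairwise distance.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    max_d = max(max(row) for row in dist) or 1.0  # avoid division by zero
    hists = []
    for i in range(n):
        h = [0.0] * bins
        for j in range(n):
            if i == j:
                continue
            b = min(int(dist[i][j] / max_d * bins), bins - 1)
            h[b] += 1.0 / (n - 1)
        hists.append(h)
    return hists


def difference_scores(hists):
    """L1 gap between each point's histogram and the average histogram (assumed measure)."""
    n, bins = len(hists), len(hists[0])
    mean = [sum(h[b] for h in hists) / n for b in range(bins)]
    return [sum(abs(h[b] - mean[b]) for b in range(bins)) for h in hists]


def two_means_1d(scores, iters=50):
    """Split scalar scores into a low and a high cluster; True marks the high cluster."""
    lo, hi = min(scores), max(scores)
    for _ in range(iters):
        high = [s for s in scores if abs(s - hi) < abs(s - lo)]
        low = [s for s in scores if abs(s - hi) >= abs(s - lo)]
        new_lo = sum(low) / len(low) if low else lo
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return [abs(s - hi) < abs(s - lo) for s in scores]


def flag_outliers(points, bins=10):
    """Flag points whose distance distribution deviates most from the average one."""
    scores = difference_scores(distance_histograms(points, bins))
    return two_means_1d(scores)
```

Calling `flag_outliers` on a list of coordinate tuples returns one boolean per point. In this sketch the clustering operates on a scalar per point, whatever the dimensionality of the input, which is one way to read the abstract's scalability claim. The pairwise distance computation is still quadratic in the number of points.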

Editor information

Takashi Washio, Zhi-Hua Zhou, Joshua Zhexue Huang, Xiaohua Hu, Jinyan Li, Chao Xie, Jieyue He, Deqing Zou, Kuan-Ching Li, Mário M. Freire

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Niu, K., Huang, C., Zhang, S., Chen, J. (2007). ODDC: Outlier Detection Using Distance Distribution Clustering. In: Washio, T., et al. Emerging Technologies in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4819. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77018-3_34

  • DOI: https://doi.org/10.1007/978-3-540-77018-3_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77016-9

  • Online ISBN: 978-3-540-77018-3

  • eBook Packages: Computer Science, Computer Science (R0)
