Attribute Outlier Detection over Data Streams

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5982)

Abstract

Outlier detection is widely used in many data stream applications, such as network intrusion detection and fraud detection. However, most existing algorithms focus on detecting class outliers, and there is little work on detecting attribute outliers, a task that considers the correlation or relevance among data items. In this paper, we study the problem of detecting attribute outliers within sliding windows over data streams. We propose an efficient algorithm that performs exact outlier detection. The algorithm relies on an efficient data structure that stores only the necessary information and handles the updates incurred by data arrival and expiration at minimal cost. To address the problem of limited memory, we also present an approximate algorithm that selectively drops data within the current window while maintaining a maximum error bound. Extensive experiments show that our algorithms are efficient and effective.
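
To make the sliding-window setting concrete, the following is a minimal sketch of incremental count maintenance under data arrival and expiration. It is not the data structure or algorithm from the paper, whose details are not given in the abstract. The class name `SlidingWindowCounts`, the count-based window, the thresholds `min_support` and `max_conf`, and the rule "a value is suspicious if it rarely co-occurs with a frequent value of another attribute" are all assumptions made for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowCounts:
    """Hypothetical count-based sliding window over categorical records."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.single = Counter()  # (attr_index, value) -> count in window
        self.pair = Counter()    # ((i, vi), (j, vj)), i < j -> count in window

    def _update(self, record, delta):
        # Apply +1 on arrival or -1 on expiration to every affected counter.
        items = list(enumerate(record))
        for item in items:
            self.single[item] += delta
        for a, b in combinations(items, 2):
            self.pair[(a, b)] += delta

    def add(self, record):
        record = tuple(record)
        self.window.append(record)
        self._update(record, +1)
        if len(self.window) > self.window_size:
            self._update(self.window.popleft(), -1)  # oldest record expires

    def suspicious(self, record, min_support=3, max_conf=0.3):
        """Return attribute indices whose value is rare given another value.

        Assumed rule: attribute j is flagged if some other attribute's value
        occurs at least min_support times in the window but co-occurs with
        record[j] in at most a max_conf fraction of those occurrences.
        """
        items = list(enumerate(record))
        flagged = set()
        for a, b in combinations(items, 2):
            together = self.pair[(a, b)]
            for given, target in ((a, b), (b, a)):
                support = self.single[given]
                if support >= min_support and together / support <= max_conf:
                    flagged.add(target[0])
        return sorted(flagged)


w = SlidingWindowCounts(window_size=5)
for _ in range(4):
    w.add(("tcp", "80"))
w.add(("tcp", "9999"))
print(w.suspicious(("tcp", "9999")))  # flags attribute index 1 under these thresholds
```

Each arrival or expiration touches only the counters of the record involved, so the per-update cost depends on the number of attributes rather than on the window size. An approximate variant with an error bound, as mentioned in the abstract, would need additional machinery that this sketch does not attempt.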

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cao, H., Zhou, Y., Shou, L., Chen, G. (2010). Attribute Outlier Detection over Data Streams. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 5982. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12098-5_17

  • DOI: https://doi.org/10.1007/978-3-642-12098-5_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12097-8

  • Online ISBN: 978-3-642-12098-5

  • eBook Packages: Computer Science (R0)
