Example-Based Robust DB-Outlier Detection for High Dimensional Data

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4947)


Abstract

This paper presents an outlier detection method that identifies exceptional objects matching user intentions in high dimensional datasets. Outlier detection is a crucial element of many applications, such as financial analysis and fraud detection. Despite numerous studies, existing methods fail to discover outliers directly in high dimensional datasets because of the curse of dimensionality. Moreover, many algorithms require several decisive parameters to be set in advance, and these parameters are difficult to determine without prior knowledge of the dataset. To address these problems, we take an example-based approach and examine the behavior of projections of the outlier examples in a dataset. An example-based approach is promising because users can often provide a few outlier examples that suggest what they want to detect. Importantly, the method should be robust even if the user-provided examples include noise or inconsistencies. Our proposed method is based on the notion of DB (Distance-Based) outliers. Experiments demonstrate that the proposed method is effective and efficient on both synthetic and real datasets and that it tolerates noisy examples.
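
For orientation, the DB(p, D)-outlier notion of Knorr and Ng, on which the abstract says the method is based, can be sketched as follows: an object is a DB(p, D)-outlier if at least a fraction p of the objects in the dataset lie farther than distance D from it. This is a minimal illustration of that definition evaluated in a chosen projection, not the method proposed in the paper. The function name `is_db_outlier`, the `dims` parameter, and the toy data are assumptions made for illustration only.

```python
import math


def is_db_outlier(points, index, p, d, dims=None):
    """Return True if points[index] is a DB(p, d)-outlier.

    Definition used: at least a fraction p of the objects in the dataset
    lie farther than distance d from the object.
    dims is a hypothetical parameter that restricts the Euclidean
    distance to a projection (a subset of the dimensions).
    """
    if dims is None:
        dims = range(len(points[index]))
    dims = list(dims)

    # Project the candidate object onto the chosen dimensions.
    target = [points[index][k] for k in dims]

    # Count the objects that lie farther than d in the projection.
    far = sum(
        1
        for q in points
        if math.dist(target, [q[k] for k in dims]) > d
    )
    return far >= p * len(points)


# Toy data: the last point deviates only in dimension 1.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1), (0.15, 5.0)]

print(is_db_outlier(points, 4, p=0.75, d=1.0, dims=[1]))
print(is_db_outlier(points, 4, p=0.75, d=1.0, dims=[0]))
```

In this toy dataset the last point qualifies as a DB-outlier in the projection onto dimension 1 but not in the projection onto dimension 0, which shows why the choice of projection matters in high dimensional data.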




Editor information

Jayant R. Haritsa, Ramamohanarao Kotagiri, Vikram Pudi


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, Y., Kitagawa, H. (2008). Example-Based Robust DB-Outlier Detection for High Dimensional Data. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_25

  • DOI: https://doi.org/10.1007/978-3-540-78568-2_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78567-5

  • Online ISBN: 978-3-540-78568-2

  • eBook Packages: Computer Science, Computer Science (R0)
