Outlier Detection Integrating Semantic Knowledge

  • Conference paper
Advances in Web-Age Information Management (WAIM 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2419)

Abstract

Existing proposals on outlier detection do not take the semantic knowledge of the dataset into consideration. They try to find outliers only from the dataset itself, which prevents them from finding more meaningful outliers. In this paper, we consider the problem of outlier detection integrating semantic knowledge. We introduce a new definition of outlier: the semantic outlier. A semantic outlier is a data point that behaves differently from other data points in the same class. We present a measure of the degree to which each object is an outlier, called the semantic outlier factor (SOF). An efficient algorithm for mining semantic outliers based on SOF is also proposed. Experimental results show that meaningful and interesting outliers can be found with our method.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

He, Z., Deng, S., Xu, X. (2002). Outlier Detection Integrating Semantic Knowledge. In: Meng, X., Su, J., Wang, Y. (eds) Advances in Web-Age Information Management. WAIM 2002. Lecture Notes in Computer Science, vol 2419. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45703-8_12

  • DOI: https://doi.org/10.1007/3-540-45703-8_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44045-1

  • Online ISBN: 978-3-540-45703-9

  • eBook Packages: Springer Book Archive
