Outlier Detection Integrating Semantic Knowledge

He, Zengyou; Deng, Shengchun; Xu, Xiaofei

doi:10.1007/3-540-45703-8_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2419)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

Existing approaches to outlier detection do not take the semantic knowledge of the dataset into consideration. They only try to find outliers from the dataset itself, which prevents them from finding more meaningful outliers. In this paper, we consider the problem of outlier detection that integrates semantic knowledge. We introduce a new definition of outlier: the semantic outlier. A semantic outlier is a data point that behaves differently from other data points in the same class. We present a measure of the degree to which each object is an outlier, called the semantic outlier factor (SOF), and propose an efficient algorithm for mining semantic outliers based on SOF. Experimental results show that meaningful and interesting outliers can be found with our method.
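The idea of a point that "behaves differently from other data points in the same class" can be illustrated with a small sketch. This is not the SOF measure from the paper, whose formula is not given in the abstract. It assumes that cluster assignments are already available from some clustering step, and it scores each point by the share of its cluster that carries the same class label. The function name `class_agreement_scores`, the toy data, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def class_agreement_scores(clusters, labels):
    """Return, for each point, the fraction of points in its cluster
    that share its class label (an illustrative score, not the paper's SOF)."""
    if len(clusters) != len(labels):
        raise ValueError("clusters and labels must have the same length")

    # Count how often each class label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster_id, label in zip(clusters, labels):
        label_counts[cluster_id][label] += 1

    scores = []
    for cluster_id, label in zip(clusters, labels):
        counts = label_counts[cluster_id]
        scores.append(counts[label] / sum(counts.values()))
    return scores


# Toy data (assumed): cluster 0 is dominated by class "a", cluster 1 by class "b".
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
labels = ["a", "a", "a", "b", "b", "b", "b", "a"]

scores = class_agreement_scores(clusters, labels)

# Points whose class is a minority within their cluster are flagged as candidates.
candidates = [i for i, score in enumerate(scores) if score < 0.5]
print(candidates)  # → [3, 7]
```

A low score means the point sits among points of a different class, which is one simple reading of "semantic outlier". A real measure would likely also account for how common each class is in the whole dataset.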

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, 150001, P. R. China
Zengyou He, Shengchun Deng & Xiaofei Xu

Editor information

Editors and Affiliations

Information School, Renmin University of China, Beijing, 100872, China
Xiaofeng Meng
Department of Computer Science, University of California, Santa Barbara, CA, 93106-5110, USA
Jianwen Su & Yujun Wang

About this paper

Cite this paper

He, Z., Deng, S., Xu, X. (2002). Outlier Detection Integrating Semantic Knowledge. In: Meng, X., Su, J., Wang, Y. (eds) Advances in Web-Age Information Management. WAIM 2002. Lecture Notes in Computer Science, vol 2419. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45703-8_12

DOI: https://doi.org/10.1007/3-540-45703-8_12
Published: 21 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44045-1
Online ISBN: 978-3-540-45703-9
eBook Packages: Springer Book Archive
