Abstract
In this paper, we propose using the rough set framework to detect outliers, that is, individuals who behave in an unexpected way or have abnormal properties. The ability to locate outliers helps maintain the integrity of a knowledge base and single out irregular individuals. First, we formally define the notions of exceptional set and minimal exceptional set. We then analyze some special cases of both. Finally, we introduce a new definition of outliers together with a definition of exceptional degree. By calculating the exceptional degree of each object in the minimal exceptional sets, we can find all outliers in a given dataset.
This work is supported by the National NSF of China (60273019 and 60073017), the National 973 Project of China (G1999032701), the Ministry of Science and Technology (2001CCA03000) and the National Laboratory of Software Development Environment.
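As background for the rough set framework named in the abstract, the following is a minimal sketch of the standard Pawlak lower and upper approximations of a set of objects. It does not reproduce the paper's definitions of exceptional set, minimal exceptional set, or exceptional degree, which the abstract does not spell out. The toy table, the attribute names, and the helper functions `equivalence_classes` and `approximations` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group object ids that agree on every attribute in `attributes`."""
    groups = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        groups[key].add(obj)
    return list(groups.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) approximation of `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical toy information table: object id -> attribute values.
table = {
    "o1": {"colour": "red", "size": "small"},
    "o2": {"colour": "red", "size": "small"},
    "o3": {"colour": "blue", "size": "large"},
    "o4": {"colour": "blue", "size": "small"},
}
target = {"o1", "o3"}

lower, upper = approximations(table, ["colour", "size"], target)
boundary = upper - lower  # objects that cannot be classified with certainty
```

Here `o1` and `o2` are indiscernible on the chosen attributes, so `o1` cannot be placed in the target set with certainty and both objects fall into the boundary region.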
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiang, F., Sui, Y., Cao, C. (2005). Outlier Detection Using Rough Set Theory. In: Ślęzak, D., Yao, J., Peters, J.F., Ziarko, W., Hu, X. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2005. Lecture Notes in Computer Science, vol 3642. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11548706_9
DOI: https://doi.org/10.1007/11548706_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28660-8
Online ISBN: 978-3-540-31824-8
eBook Packages: Computer Science, Computer Science (R0)