Abstract
With huge amounts of data generated every second, it has become important to remove data anomalies. Outliers are extreme values that deviate from the other observations in a dataset. We propose a novel outlier detection method, FCBODM (Fuzzy Constraint Based Outlier Detection Method), which takes fuzzy constraints and background knowledge into account to discover the outliers in a dataset. Our key idea is to use fuzzy constraint technology: we apply the nearness measure theory of fuzzy mathematics to find similarities between data objects and background information, which helps in finding more meaningful outliers. Our approach can be integrated with traditional outlier detection methods to improve the outlier ranking. To validate and demonstrate the effectiveness and scalability of our method, we evaluated it on real, semantically meaningful datasets.
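To make the nearness idea concrete, here is a minimal Python sketch. It is not the FCBODM algorithm itself. It assumes that each data object and the background knowledge are encoded as vectors of fuzzy membership degrees in [0, 1], and it uses the min-max nearness degree, one common nearness measure from fuzzy mathematics; the abstract does not say which measure the method uses. The names `min_max_nearness` and `rerank` and the `weight` parameter are hypothetical and serve only as illustration.

```python
from typing import Dict, List, Sequence


def min_max_nearness(a: Sequence[float], b: Sequence[float]) -> float:
    """Min-max nearness of two fuzzy membership vectors with values in [0, 1].

    Returns 1.0 for identical vectors and 0.0 for vectors with disjoint support.
    """
    if len(a) != len(b):
        raise ValueError("membership vectors must have the same length")
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    # Two all-zero vectors are treated as identical.
    return 1.0 if denominator == 0 else numerator / denominator


def rerank(
    base_scores: Dict[str, float],
    memberships: Dict[str, Sequence[float]],
    background: Sequence[float],
    weight: float = 0.5,
) -> List[str]:
    """Blend a base outlier score with dissimilarity to background knowledge.

    Assumption: base scores are already scaled to [0, 1], and an object that
    is far from the background pattern should rank as more outlying.
    """
    combined = {
        key: (1.0 - weight) * score
        + weight * (1.0 - min_max_nearness(memberships[key], background))
        for key, score in base_scores.items()
    }
    # Highest combined score first.
    return sorted(combined, key=combined.get, reverse=True)
```

In this sketch the base scores could come from any traditional detector, for example a distance-based or density-based one, rescaled to [0, 1]. The linear blend is only one possible way to combine the two signals.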
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Sharma, V., Nagpal, A., Tripathy, B. (2019). A Fuzzy Constraint Based Outlier Detection Method. In: Huang, DS., Huang, ZK., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2019. Lecture Notes in Computer Science, vol 11645. Springer, Cham. https://doi.org/10.1007/978-3-030-26766-7_47
DOI: https://doi.org/10.1007/978-3-030-26766-7_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26765-0
Online ISBN: 978-3-030-26766-7
eBook Packages: Computer Science, Computer Science (R0)