A Fuzzy Constraint Based Outlier Detection Method

  • Conference paper
Intelligent Computing Methodologies (ICIC 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11645)

Abstract

With a huge amount of data generated every second, removing data anomalies has become important. Outliers are extreme values that deviate from the other observations in the data. We propose a novel outlier detection method, FCBODM (Fuzzy Constraint Based Outlier Detection Method), that takes fuzzy constraints and background knowledge into account to discover the outliers in a dataset. Our key idea is to use fuzzy constraint technology, in which nearness measure theory from fuzzy mathematics is used to find similarities between data objects and background information. This helps in finding more meaningful outliers. Our approach can be integrated with traditional outlier detection methods to improve the outlier ranking. To validate and demonstrate the effectiveness and scalability of our method, we ran experiments on real and semantically meaningful datasets.
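
The abstract does not say which nearness measure FCBODM uses or how it is combined with a baseline detector, so the following is only a minimal sketch under stated assumptions. It uses the standard min-max nearness degree from fuzzy mathematics, N(A, B) = Σ min(A(x), B(x)) / Σ max(A(x), B(x)), and a hypothetical re-ranking rule that scales each baseline outlier score by one minus the object's nearness to a background profile. The function names, the membership vectors, and the assumption that the background profile describes expected behaviour are all illustrative and are not taken from the paper.

```python
from typing import List, Sequence


def min_max_nearness(a: Sequence[float], b: Sequence[float]) -> float:
    """Min-max nearness degree between two fuzzy sets.

    Both sets are given as membership vectors over the same universe,
    with every membership value in [0, 1].
    """
    if len(a) != len(b):
        raise ValueError("membership vectors must have the same length")
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    # Two empty fuzzy sets (all memberships zero) are treated as identical.
    return 1.0 if denominator == 0 else numerator / denominator


def rerank(
    base_scores: Sequence[float],
    memberships: Sequence[Sequence[float]],
    background: Sequence[float],
) -> List[int]:
    """Hypothetical re-ranking rule (an assumption, not the paper's formula).

    Each baseline outlier score is scaled by (1 - nearness to background),
    so objects that resemble the background profile are pushed down.
    Returns object indices ordered from most to least outlying.
    """
    adjusted = [
        score * (1.0 - min_max_nearness(m, background))
        for score, m in zip(base_scores, memberships)
    ]
    return sorted(range(len(adjusted)), key=lambda i: adjusted[i], reverse=True)


if __name__ == "__main__":
    # Illustrative values only: scores as if from any traditional detector.
    scores = [0.9, 0.8, 0.1]
    objects = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
    background_profile = [0.9, 0.1]
    print(rerank(scores, objects, background_profile))
```

In this toy run the first object has the highest baseline score but matches the background profile exactly, so it drops to the bottom of the ranking. If the background knowledge instead described known anomalies, the scaling factor would need to be the nearness itself rather than its complement.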

Author information

Corresponding authors

Correspondence to Vasudev Sharma or Balakrushna Tripathy.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Sharma, V., Nagpal, A., Tripathy, B. (2019). A Fuzzy Constraint Based Outlier Detection Method. In: Huang, DS., Huang, ZK., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2019. Lecture Notes in Computer Science, vol 11645. Springer, Cham. https://doi.org/10.1007/978-3-030-26766-7_47

  • DOI: https://doi.org/10.1007/978-3-030-26766-7_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26765-0

  • Online ISBN: 978-3-030-26766-7

  • eBook Packages: Computer Science (R0)
