NaNOD: A natural neighbour-based outlier detection algorithm

  • Original Article
  • Neural Computing and Applications

Abstract

Outlier detection is an essential task in data mining, with applications that include military surveillance, tax fraud detection, and telecommunications. In recent years, outlier detection has received significant attention compared with other knowledge discovery problems. This focus has produced a growing number of outlier detection algorithms, most of which follow a distance-based or density-based strategy. However, each strategy has intrinsic weaknesses. Distance-based techniques struggle with varying local density, while density-based methods are known to have difficulty with low-density patterns. In addition, most existing outlier detection algorithms suffer from a parameter selection problem, which leads to poor detection results. In this article, we present an unsupervised density-based outlier detection algorithm that addresses these shortcomings. The proposed algorithm uses the Natural Neighbour (NaN) concept to adaptively obtain a parameter called the Natural Value (NV), and a Weighted Kernel Density Estimation (WKDE) method to estimate the density at the location of an object. In addition, the algorithm employs two categories of nearest neighbours, k Nearest Neighbours (kNN) and Reverse Nearest Neighbours (RNN), which make it flexible in modelling different data patterns. A Gaussian kernel function is adopted to achieve smoothness in the measure. Further, we use an adaptive kernel width to enhance the discrimination power between normal and outlier samples. Formal analysis and extensive experiments on both artificial and real datasets demonstrate that this technique can achieve better outlier detection performance.
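
The ingredients named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' NaNOD algorithm: the abstract gives no formulas, so the stopping rule for the natural value (one common formulation of natural neighbour search), the choice of each point's k-distance as its adaptive kernel width, the unweighted averaging of kernel values, and the relative-density score are all assumptions made for illustration. The function names `sorted_neighbours`, `natural_value`, and `relative_density_scores` are hypothetical.

```python
import numpy as np

def sorted_neighbours(X):
    """Return pairwise distances D and, per row, the other points by increasing distance."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)  # a point is never its own neighbour
    order = np.argsort(D, axis=1, kind="stable")[:, :-1]  # self sorts last; drop it
    return D, order

def natural_value(order):
    """Grow r until every point is some point's r-NN, or the count of
    points with no reverse neighbour stops changing (assumed stopping rule)."""
    n = order.shape[0]
    reverse_count = np.zeros(n, dtype=int)
    prev_orphans = -1
    for r in range(1, n):
        # Each point votes for its r-th nearest neighbour.
        np.add.at(reverse_count, order[:, r - 1], 1)
        orphans = int(np.count_nonzero(reverse_count == 0))
        if orphans == 0 or orphans == prev_orphans:
            return r
        prev_orphans = orphans
    return n - 1

def relative_density_scores(X):
    """Score each point by its neighbours' mean density divided by its own.
    Larger scores suggest outliers. Illustrative only, not the paper's score."""
    n, dim = X.shape
    D, order = sorted_neighbours(X)
    k = natural_value(order)
    knn = order[:, :k]
    is_nb = np.zeros((n, n), dtype=bool)
    is_nb[np.arange(n)[:, None], knn] = True  # is_nb[i, j]: j is a kNN of i
    is_nb = is_nb | is_nb.T                   # add reverse nearest neighbours
    # Assumed adaptive width: distance from each point to its k-th neighbour.
    width = np.maximum(D[np.arange(n), knn[:, -1]], 1e-12)
    density = np.empty(n)
    for i in range(n):
        nb = np.flatnonzero(is_nb[i])
        # Gaussian kernel; the normalising constant is omitted because it
        # cancels in the density ratio below.
        kernel = np.exp(-0.5 * (D[i, nb] / width[i]) ** 2)
        density[i] = kernel.mean() / width[i] ** dim
    scores = np.array([density[is_nb[i]].mean() / density[i] for i in range(n)])
    return scores, k
```

Calling `relative_density_scores(X)` on an `(n, d)` array returns one score per row together with the neighbourhood size found by the search, so no value of k has to be supplied by the user. The dense pairwise distance matrix keeps the sketch short but costs O(n²) memory, so a spatial index would be needed for larger datasets.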

Notes

  1. http://www.archive.ics.uci.edu/ml/.

Acknowledgements

We would like to show our gratitude to Professor Jin Ningour from Department of Computer Science and Engineering, University of Electronic Science and Technology of China, China, and Professor Jinlong Huang from Chongqing Key Lab of Software Theory and Technology, College of Computer Science, Chongqing University, China, for providing some artificial datasets for this research.

We would also like to show our gratitude and thanks to the Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, India, for providing the facility and support for this research work. The authors would like to thank the associate editor and anonymous referees for their helpful and constructive comments.

Author information

Corresponding author

Correspondence to Abdul Wahid.

Ethics declarations

Conflict of interest

We hereby declare that we have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wahid, A., Annavarapu, C.S.R. NaNOD: A natural neighbour-based outlier detection algorithm. Neural Comput & Applic 33, 2107–2123 (2021). https://doi.org/10.1007/s00521-020-05068-2

  • DOI: https://doi.org/10.1007/s00521-020-05068-2
