Abstract
Outlier detection is a crucial research problem in data mining, aiming to identify data objects that significantly deviate from the distribution of other data. To address the difficulties that nearest-neighbor-based outlier detection methods have with low-density patterns and low local density, this paper proposes an outlier detection algorithm based on the relative skewness density ratio outlier factor. The natural neighbor search algorithm is used to determine the number of neighbors (the k value) and the neighborhood adaptively, which addresses the parameter-setting problem. The algorithm introduces the concept of relative skewness to quantify how much a data object deviates from its neighbors, along with a local density ratio to capture variations in local density. Together these yield a new outlier measure, the Relative Skewness Density Ratio Outlier Factor, which uses the ratio of relative skewness to local density as the outlier factor. The outlier degree of each data object is then assessed by evaluating how far this factor deviates from that of its neighbors. The proposed algorithm is validated experimentally on both artificial and real-world datasets and compared against recent outlier detection algorithms; the results demonstrate its effectiveness.
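To make the adaptive choice of k concrete, the following is a minimal sketch of a natural neighbor search in the spirit of Zhu et al. (2016), not the authors' implementation. The function name `natural_neighbor_search`, the brute-force distance sort, and the stopping rule (stop once every object has a reverse neighbor, or once the count of objects without one stops changing) are assumptions made for illustration. The abstract does not define relative skewness or the local density ratio, so the sketch covers only the neighbor search.

```python
import math


def natural_neighbor_search(points):
    """Simplified natural neighbor search (illustrative assumption).

    Returns (k, natural), where k is the adaptively chosen neighbor count
    and natural[i] is the set of mutual neighbors of point i.
    """
    n = len(points)
    # For every point, list the other points from nearest to farthest.
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j: math.dist(points[i], points[j]))
        for i in range(n)
    ]
    knn = [set() for _ in range(n)]  # neighbors found so far
    rnn = [set() for _ in range(n)]  # reverse neighbors found so far
    prev_orphans = None
    k = 0
    for r in range(n - 1):
        # Grow every neighborhood by one: add the (r + 1)-th nearest point.
        for i in range(n):
            j = order[i][r]
            knn[i].add(j)
            rnn[j].add(i)
        k = r + 1
        # An orphan is a point that nobody has chosen as a neighbor yet.
        orphans = sum(1 for s in rnn if not s)
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    # Natural neighbors are mutual: i chose j and j chose i.
    natural = [knn[i] & rnn[i] for i in range(n)]
    return k, natural


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    k, natural = natural_neighbor_search(data)
    print("k =", k)
    print("natural neighbors:", natural)
```

In this toy example the isolated point never becomes anyone's neighbor, so it ends with no natural neighbors, which is the kind of signal a neighbor-based outlier factor can then build on.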
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported by grants from the National Natural Science Foundation of China (No. 61972334), the National Social Science Foundation of China General Project (No. 20BJ122), the Innovation Capability Improvement Plan Project of Hebei Province (No. 22567626H), the Local Science and Technology Development Fund Project guided by the Central Government (No. 226Z1707G), and the Intelligent image workpiece recognition of Sida Railway project (No. x2021134).
Author information
Contributions
Zhongping Zhang proposed the core idea of the model and the experimental method. Kuo Wang implemented the model, verified its validity, and wrote the paper. Jinyu Dong and Sen Li provided guidance and revised the paper.
Ethics declarations
Competing Interest
The author(s) declare no potential conflicts of interest with respect to the research, authorship and/or publication of this paper.
Compliance with Ethical Standards
This paper involves no data that raise ethical concerns.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Z., Wang, K., Dong, J. et al. SDROF: outlier detection algorithm based on relative skewness density ratio outlier factor. Appl Intell 55, 67 (2025). https://doi.org/10.1007/s10489-024-06092-8
DOI: https://doi.org/10.1007/s10489-024-06092-8