Abstract
Choosing the value of k for k-nearest neighbors is a fundamental problem in machine learning, pattern recognition and knowledge discovery. Natural neighbor (NaN) is an adaptive neighbor concept that addresses this problem by combining k-nearest neighbors and reverse k-nearest neighbors to obtain the value of k adaptively. It has proven effective in clustering analysis, classification and outlier detection. However, the existing algorithms for searching NaNs all use a global search strategy, which wastes time on non-critical points. In this paper, we propose a novel accelerated algorithm for searching natural neighbors, called ASNN. It is based on the fact that if the remote objects have NaNs, then all other objects certainly have NaNs as well. The main idea of ASNN is to first extract the remote points and then search only the neighbors of those remote points, instead of the neighbors of all points, so that ASNN can quickly obtain the natural neighbor eigenvalue (NaNE). To identify the remote objects, ASNN searches only the 1-nearest neighbor of each object with a kd-tree, so its time complexity is reduced from \(O(n^2)\) to \(O(n \log n)\), and the local search strategy makes it run faster than the existing algorithms. To illustrate the efficiency of ASNN, we compare it with three existing algorithms: NaNs, kd-NaN and FSNN. Experiments on synthetic and real datasets show that ASNN runs much faster than NaNs, kd-NaN and FSNN, especially on large-scale datasets.
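The remote-point idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' ASNN implementation, and it rests on several assumptions that the abstract does not confirm: a remote point is taken to be a point that is not the 1-nearest neighbor of any other point, an object is taken to "have NaNs" at a given k once it appears in the k-nearest-neighbor list of at least one other point, and the NaNE is taken to be the smallest k at which this holds for every object. All function names are hypothetical, and brute-force distance sorting stands in for the kd-tree, so the sketch shows the logic only, not the speed-up.

```python
from math import dist

def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return others[:k]

def has_reverse_neighbor(points, p, k):
    """True if p lies in the k-nearest-neighbor list of at least one other point."""
    return any(p in knn(points, q, k) for q in range(len(points)) if q != p)

def remote_points(points):
    """Assumed definition: points that are nobody's 1-nearest neighbor."""
    chosen = {knn(points, i, 1)[0] for i in range(len(points))}
    return [i for i in range(len(points)) if i not in chosen]

def nane_from_remote_points(points):
    """Grow k while checking only the remote points (needs at least 2 points)."""
    pending = remote_points(points)
    k = 1
    while pending:
        k += 1
        # Keep only the remote points that still have no reverse neighbor.
        pending = [p for p in pending if not has_reverse_neighbor(points, p, k)]
    return k

def nane_global(points):
    """Reference version: grow k while checking every point."""
    k = 1
    while not all(has_reverse_neighbor(points, p, k) for p in range(len(points))):
        k += 1
    return k

# Hypothetical toy data: three nearby points and one isolated point.
line = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (10.0, 0.0)]
print(remote_points(line), nane_from_remote_points(line), nane_global(line))
```

Under these assumptions the two versions return the same value: every non-remote point is already some point's 1-nearest neighbor, and k-nearest-neighbor lists only grow with k, so only the remote points can delay termination.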
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China under Grant 62006029, in part by the Postdoctoral Innovative Talent Support Program of Chongqing under Grant CQBX2021024, in part by the Natural Science Foundation of Chongqing (China) under Grants cstc2019jcyj-msxmX0683 and cstc2020jscxlyjsAX0008, and in part by the Project of Chongqing Municipal Education Commission, China, under Grant KJQN202001434.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cheng, D., Luo, J., Huang, J., Zhang, S. (2022). ASNN: Accelerated Searching for Natural Neighbors. In: Li, T., et al. Big Data. BigData 2022. Communications in Computer and Information Science, vol 1709. Springer, Singapore. https://doi.org/10.1007/978-981-19-8331-3_3
DOI: https://doi.org/10.1007/978-981-19-8331-3_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8330-6
Online ISBN: 978-981-19-8331-3
eBook Packages: Computer Science, Computer Science (R0)