Weighted natural neighborhood graph: an adaptive structure for clustering and outlier detection with no neighborhood parameter

Zhu, Qingsheng; Feng, Ji; Huang, Jinlong

doi:10.1007/s10586-016-0598-1

Published: 29 July 2016

Cluster Computing, volume 19, pages 1385–1397 (2016)

Abstract

This paper addresses practical shortcomings of nearest-neighbor-based data mining techniques, in particular clustering and outlier detection. When a data set contains arbitrarily shaped clusters of varying density, it is difficult to choose proper parameters without a priori knowledge. To address this issue, we define a novel concept called the natural neighbor, which reflects the relationships between the elements of a data set better than the k-nearest neighbor does, and we present a graph, the weighted natural neighborhood graph, for clustering and outlier detection. The whole process requires no parameter to handle different data sets. Simulations on both synthetic and real-world data demonstrate the effectiveness of the proposed method.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61272194 and No. 61073058) and the Natural Science Foundation Project of CQ CSTC (cstc2013jcyjA40049).

Author information

Authors and Affiliations

Chongqing Key Laboratory of Software Theory and Technology, College of Computer Science, Chongqing University, Chongqing, 400044, China
Qingsheng Zhu, Ji Feng & Jinlong Huang

Corresponding author

Correspondence to Ji Feng.

About this article

Cite this article

Zhu, Q., Feng, J. & Huang, J. Weighted natural neighborhood graph: an adaptive structure for clustering and outlier detection with no neighborhood parameter. Cluster Comput 19, 1385–1397 (2016). https://doi.org/10.1007/s10586-016-0598-1

Received: 09 March 2016
Revised: 18 June 2016
Accepted: 02 July 2016
Published: 29 July 2016
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10586-016-0598-1
