Clustering based on a near neighbor graph and a grid cell graph

Chen, Xinquan

doi:10.1007/s10844-013-0236-9

Published: 13 March 2013

Journal of Intelligent Information Systems, volume 40, pages 529–554 (2013)


Abstract

This paper presents two novel graph-clustering algorithms: Clustering based on a Near Neighbor Graph (CNNG) and Clustering based on a Grid Cell Graph (CGCG). The CNNG algorithm, inspired by the idea of near neighbors, is an improved graph-clustering method based on the Minimum Spanning Tree (MST). To analyze massive data sets more efficiently, the CGCG algorithm is presented; it is a graph-clustering method based on an MST built at the level of grid cells. To describe the two algorithms clearly, we define several important concepts, such as the near neighbor point set, the near neighbor undirected graph, and the grid cell. To implement the two algorithms effectively, we use efficient partitioning and indexing methods, such as a multidimensional grid partition method and a multidimensional index tree. In simulation experiments on several artificial data sets and seven real data sets, we observe that the time cost of the CNNG algorithm can be decreased by using some improving techniques and approximate methods while still attaining acceptable clustering quality, and that the CGCG algorithm can approximately analyze some dense data sets with linear time cost. Moreover, compared with some classical clustering algorithms, the CNNG algorithm often achieves better clustering quality or faster clustering speed.
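
The abstract does not give the details of CNNG or CGCG, so the following is only a minimal sketch of the general idea it names: MST-style clustering restricted to a near neighbor graph, plus a simple grid partition. It is not the author's algorithm. The function names (`knn_edges`, `mst_clusters`, `grid_cells`), the use of a brute-force k-nearest-neighbor search, and the parameters `k`, `n_clusters`, and `cell_size` are all assumptions made for illustration.

```python
import math


def knn_edges(points, k):
    """Undirected edges joining each point to its k nearest neighbours.

    Brute-force search, used here only for clarity (an assumption; the paper
    mentions index structures instead). Returns (distance, i, j) sorted by distance.
    """
    edges = set()
    for i, p in enumerate(points):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        for d, j in nearest:
            edges.add((d, min(i, j), max(i, j)))
    return sorted(edges)


def mst_clusters(points, k, n_clusters):
    """Kruskal's algorithm on the neighbour graph, stopped early.

    Stopping once n_clusters components remain gives the same partition as
    building the spanning forest and cutting its longest edges. If the
    neighbour graph has more components than n_clusters, more clusters result.
    """
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = len(points)
    for _, i, j in knn_edges(points, k):
        if components <= n_clusters:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(points))]


def grid_cells(points, cell_size):
    """Map each point index to the axis-aligned grid cell that contains it."""
    cells = {}
    for idx, p in enumerate(points):
        key = tuple(math.floor(c / cell_size) for c in p)
        cells.setdefault(key, []).append(idx)
    return cells


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = mst_clusters(points, k=2, n_clusters=2)
cells = grid_cells(points, cell_size=5.0)
```

Under these assumptions, a grid-level variant could run the same `mst_clusters` routine on one representative per non-empty cell (for example, the cell centre) rather than on every point, which is one way the work could scale with the number of cells instead of the number of points.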

Acknowledgements

The author thanks the editors, the anonymous reviewers and Dr. Peijie Hang for their useful comments and suggestions.

Author information

Authors and Affiliations

School of Computer Science & Engineering, Chongqing Three Gorges University, Chongqing, China
Xinquan Chen

Corresponding author

Correspondence to Xinquan Chen.

Electronic Supplementary Material

The electronic supplementary material consists of four DOC files (35.5 KB, 51.5 KB, 34.0 KB, and 741 KB).

About this article

Cite this article

Chen, X. Clustering based on a near neighbor graph and a grid cell graph. J Intell Inf Syst 40, 529–554 (2013). https://doi.org/10.1007/s10844-013-0236-9

Received: 26 April 2011
Revised: 28 January 2013
Accepted: 31 January 2013
Published: 13 March 2013
Issue Date: June 2013
DOI: https://doi.org/10.1007/s10844-013-0236-9
