Abstract
Exploratory spatial analysis is increasingly necessary as ever larger volumes of spatial data are managed in electro-magnetic media. We propose an exploratory method that reveals a robust clustering hierarchy from 2-D point data. Our approach uses the Delaunay diagram to incorporate spatial proximity. It requires neither prior knowledge about the data set nor preconditions. The new method discovers multi-level clusters in only O(n log n) time, where n is the size of the data set. The efficiency of our method allows us to construct and display a new type of tree graph that facilitates understanding of the complex hierarchy of clusters. We show that clustering methods adopting a raster-like or vector-like representation of proximity are not appropriate for spatial clustering. We conduct an experimental evaluation with synthetic and real data sets to illustrate the robustness of our method.
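The general idea of clustering on a Delaunay proximity graph can be illustrated with a small sketch. It is not the authors' algorithm. It assumes a brute-force empty-circumcircle test (O(n^4), suitable for tiny inputs only) in place of an O(n log n) Delaunay construction, and it assumes a simple cutoff, dropping edges longer than the mean edge length, as a stand-in for whatever criterion the paper actually applies. The names `delaunay_edges`, `split_once`, and `hierarchy` are hypothetical.

```python
from itertools import combinations
from math import dist


def circumcircle(a, b, c):
    """Return ((ux, uy), r_squared) for the circle through a, b, c; None if collinear."""
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return (ux, uy), (a[0] - ux) ** 2 + (a[1] - uy) ** 2


def delaunay_edges(points):
    """Brute-force Delaunay edges: keep triangles whose circumcircle is empty."""
    edges = set()
    for i, j, k in combinations(range(len(points)), 3):
        cc = circumcircle(points[i], points[j], points[k])
        if cc is None:
            continue
        (ux, uy), r2 = cc
        if all((p[0] - ux) ** 2 + (p[1] - uy) ** 2 >= r2 - 1e-9
               for m, p in enumerate(points) if m not in (i, j, k)):
            edges.update({(i, j), (i, k), (j, k)})
    return edges


def components(n, edges):
    """Connected components of an n-vertex graph, via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


def split_once(points, indices):
    """Assumed criterion: drop Delaunay edges longer than the mean edge length."""
    pts = [points[i] for i in indices]
    edges = delaunay_edges(pts)
    if not edges:
        return [indices]
    lengths = {e: dist(pts[e[0]], pts[e[1]]) for e in edges}
    mean = sum(lengths.values()) / len(lengths)
    kept = [e for e, length in lengths.items() if length <= mean]
    return [[indices[v] for v in g] for g in components(len(pts), kept)]


def hierarchy(points, indices=None):
    """Split recursively until a cluster no longer breaks apart; returns nested lists."""
    if indices is None:
        indices = list(range(len(points)))
    parts = split_once(points, indices)
    if len(parts) <= 1:
        return indices
    return [hierarchy(points, part) for part in parts]


# Two well-separated unit squares form two top-level clusters.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (11, 0), (10, 1), (11, 1)]
print(hierarchy(points))  # -> [[0, 1, 2, 3], [4, 5, 6, 7]]
```

The nested list returned by `hierarchy` plays the role of a cluster tree in this sketch: each inner list is a cluster that the assumed cutoff could not split further.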
Cite this article
Estivill-Castro, V., Lee, I. Multi-Level Clustering and its Visualization for Exploratory Spatial Analysis. GeoInformatica 6, 123–152 (2002). https://doi.org/10.1023/A:1015279009755