Abstract
Spatial data mining seeks to discover meaningful patterns in data where a prime dimension of interest is geographical location. Consideration of a spatial dimension becomes important when data refer to specific locations and/or have significant spatial dependence that must be taken into account if meaningful patterns are to emerge. For point data there are two main groups of approaches. One stems from traditional statistical techniques such as k-means clustering, in which every point is assigned to a spatial grouping, resulting in a spatial segmentation. The other broad approach searches for ‘hotspots’, which can be loosely defined as a localised excess of some incidence rate; in this approach, not all points are necessarily assigned to clusters. This paper presents a novel variable resolution approach to cluster discovery which acts in the first instance to define spatial concentrations within the data, thus allowing the nature of clustering to be defined. The cluster centroids are then used to establish initial cluster centres in a k-means clustering and so arrive at a segmentation on the basis of point attributes. The variable resolution technique can thus be viewed as a bridge between the two broad approaches to knowledge discovery in mining point data sets. Applications of the technique to date include the mining of business, crime, health and environmental data.
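The two-stage idea described above (find spatial concentrations first, then seed a k-means run with their centroids) can be illustrated with a minimal Python sketch. It is not the paper's variable resolution method: as a stand-in, hotspots are taken to be cells of a fixed, uniform grid that hold at least a minimum number of points. The k-means step here also runs on the point coordinates, whereas the abstract speaks of segmentation on the basis of point attributes. The function names, `cell_size`, `min_count` and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def hotspot_centroids(points, cell_size, min_count):
    """Stand-in hotspot finder (an assumption, not the paper's technique):
    return centroids of uniform grid cells holding >= min_count points."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    centroids = []
    for key in sorted(cells):  # sorted for a deterministic seed order
        members = cells[key]
        if len(members) >= min_count:
            cx = sum(p[0] for p in members) / len(members)
            cy = sum(p[1] for p in members) / len(members)
            centroids.append((cx, cy))
    return centroids


def seeded_kmeans(points, seeds, max_iter=100):
    """Lloyd's k-means started from the given seeds.
    Returns (centres, labels); every point receives a label."""
    centres = list(seeds)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        labels = [
            min(range(len(centres)), key=lambda k: dist(p, centres[k]))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centres.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centres.append(centres[k])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return centres, labels


# Hypothetical data: two dense groups plus two isolated points.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2),
    (9.0, 9.0), (9.2, 8.9), (8.9, 9.1), (9.1, 9.3),
    (4.5, 5.5), (2.5, 7.5),
]

seeds = hotspot_centroids(points, cell_size=2.0, min_count=3)
centres, labels = seeded_kmeans(points, seeds)
print(len(seeds), "seed centres;", "labels:", labels)
```

In this sketch the hotspot step leaves the two isolated points unassigned (they fall in sparse cells), while the subsequent k-means step assigns every point to a segment, which mirrors the contrast between the two families of approaches drawn in the abstract.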
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brimicombe, A.J. (2003). A Variable Resolution Approach to Cluster Discovery in Spatial Data Mining. In: Kumar, V., Gavrilova, M.L., Tan, C.J.K., L’Ecuyer, P. (eds) Computational Science and Its Applications — ICCSA 2003. ICCSA 2003. Lecture Notes in Computer Science, vol 2669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44842-X_1
DOI: https://doi.org/10.1007/3-540-44842-X_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40156-8
Online ISBN: 978-3-540-44842-6
eBook Packages: Springer Book Archive