Scales, levels and processes: Studying spatial patterns of British census variables
Introduction
The modifiable areal unit problem (MAUP) is a phenomenon whereby different results are obtained in analysis of the same data grouped into different sets of areal units. It vexes the geographical and spatial analyst almost as much today as it did when first identified by Gehlke and Biehl (1934) or when subsequently popularised by Openshaw and Taylor (1979, 1981). The MAUP has been subdivided into two separate but linked issues. One is the zonation issue, which concerns the effects of the arbitrary nature of the boundary division placed upon the data. The other is the scale issue, which arises where the statistical results of an analysis change as the level of analysis changes. These effects occur because the spatial processes generating the observed data may operate at scales, and over areal units, that the boundaries in use reflect more or less accurately. Among other authors, Fotheringham and Wong (1991) have demonstrated these effects for US census data, and Tranmer and Steel (2001) have done so for UK data. See Openshaw (1984) for further discussion of these concepts.
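The scale and zonation issues can be illustrated with a small synthetic example. The sketch below is not drawn from the census data analysed in this paper: the eight unit values, the zone definitions and the helper names `pearson` and `aggregate` are all assumptions made purely for illustration. It computes the correlation between two variables for the individual units, and then for two alternative groupings of the same units into four zones.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def aggregate(values, zones):
    """Mean of `values` within each zone; a zone is a tuple of unit indices."""
    return [sum(values[i] for i in zone) / len(zone) for zone in zones]


# Hypothetical values for eight basic units arranged along a line.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

# Two hypothetical zoning systems, each with four contiguous zones.
zones_a = [(0, 1), (2, 3), (4, 5), (6, 7)]
zones_b = [(0,), (1, 2), (3, 4), (5, 6, 7)]

r_units = pearson(x, y)
r_zones_a = pearson(aggregate(x, zones_a), aggregate(y, zones_a))
r_zones_b = pearson(aggregate(x, zones_b), aggregate(y, zones_b))

print(f"units:   r = {r_units:.3f}")
print(f"zones A: r = {r_zones_a:.3f}")
print(f"zones B: r = {r_zones_b:.3f}")
```

With these illustrative values the correlation is approximately 0.905 for the eight units, 1.000 for zoning A and approximately 0.990 for zoning B. Moving from units to zones changes the result (a scale effect), and redrawing the boundaries while keeping the number of zones fixed changes it again (a zonation effect).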
Two analytical techniques are applied in this paper to investigate the processes generating spatial patterns. The first technique is the multi-level model, or MLM (Jones, 1991). The MLM is based on the recognition that a response variable can be affected by processes occurring at both the individual level and the group level. Thus, the MLM can be used to assess the existence, and estimate the magnitude, of processes that operate at the individual person level and at one or more group levels. In the classic applications of MLM in education, the groups may correspond to classes or schools; in the current context, the groups may refer to geographical areas over which spatial processes operate.
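In its simplest two-level form, with no explanatory variables, a model of this kind is commonly written as below. The notation is a standard textbook one and is assumed here for illustration only; it is not necessarily the specification used in this paper. Here y_ij is the response for individual i in group (area) j, β_0 is the overall mean, u_j is the group-level effect and e_ij is the individual-level residual.

```latex
y_{ij} = \beta_0 + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2),
\qquad e_{ij} \sim N(0, \sigma_e^2)
```

The two variance terms separate the variation attributable to the group level from that attributable to individuals within groups.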
The second of these techniques is spatial autocorrelation. This has been identified as highly relevant to the analysis of spatial data, such as data available for areal units (see for instance Cliff & Ord, 1973). Spatial autocorrelation has been discussed as a factor in the debate concerning the modifiable areal unit problem (see Openshaw & Taylor, 1979). At its simplest, spatial autocorrelation can be thought of as the correlation of a variable at one place with the same variable at neighbouring places. It exemplifies Tobler’s first law of geography that “everything is related to everything else, but near things are more related than distant things” (Tobler, 1970, p. 236). Goodchild (1986) gives a more detailed treatment.
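One widely used summary of this idea is Moran's I. The sketch below is a minimal illustration rather than the measure used in this paper, which the text above does not specify. The binary contiguity weights for a line of six areas, the example values and the function names `morans_i` and `chain_weights` are all assumptions introduced here.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * (cross / sum(d * d for d in dev))


def chain_weights(n):
    """Binary contiguity weights for n areas in a line (adjacent = 1)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
clustered = [1, 1, 1, 5, 5, 5]    # similar values lie next to each other
alternating = [1, 5, 1, 5, 1, 5]  # neighbours always differ

i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)

print(f"clustered:   I = {i_clustered:.2f}")
print(f"alternating: I = {i_alternating:.2f}")
```

For these hypothetical values the clustered arrangement gives I = 0.60 (positive autocorrelation) and the alternating arrangement gives I = -1.00 (negative autocorrelation), although both contain exactly the same set of values.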
Spatial autocorrelation can inform analysts about the patterning of areal data. It is therefore logical to consider spatial autocorrelation and multi-level modelling together. Jones (1991, p. 8) states, “the degree of auto-correlation in MLM can loosely be conceived as the ratio of ‘variation at the higher level’ to the ‘total variation at all levels’. A value of zero for a spatial autocorrelation coefficient signifies no auto-correlation, indicating that there is no variation at the higher level”. The work presented here builds on this basis, aiming to find evidence for the spatial processes generating the data under analysis, using a combination of adapted multi-level modelling and spatial autocorrelation techniques. The paper also provides conclusions about the patterns displayed by certain British census variables.
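The ratio described by Jones can be made concrete with a simple variance decomposition. The sketch below estimates the share of total variance lying at the group level using one-way ANOVA moment estimators for balanced groups. This is a simplification chosen for illustration and is not the adapted multi-level method developed in this paper; the function name `variance_partition` and the example data are assumptions.

```python
def variance_partition(groups):
    """Estimated ratio of group-level variance to total variance.

    Assumes at least two groups of equal size (two or more members each).
    A negative between-group estimate is truncated at zero.
    """
    size = len(groups[0])
    if any(len(grp) != size for grp in groups):
        raise ValueError("groups must all be the same size")
    n_groups = len(groups)
    n = n_groups * size
    grand_mean = sum(sum(grp) for grp in groups) / n
    means = [sum(grp) / size for grp in groups]

    # Between-group and within-group mean squares.
    msb = size * sum((m - grand_mean) ** 2 for m in means) / (n_groups - 1)
    msw = sum(
        (v - m) ** 2 for grp, m in zip(groups, means) for v in grp
    ) / (n - n_groups)

    var_between = max(0.0, (msb - msw) / size)  # higher-level variance
    var_within = msw                            # individual-level variance
    total = var_between + var_within
    return var_between / total if total > 0 else 0.0


# Hypothetical individuals grouped into two areas.
all_between = variance_partition([[1, 1, 1], [5, 5, 5]])
no_between = variance_partition([[1, 3, 5], [1, 3, 5]])
mixed = variance_partition([[1, 2, 3], [5, 6, 7]])

print(all_between, no_between, round(mixed, 3))
```

When every individual in an area shares the same value the ratio is 1; when the area means are identical it is 0, matching the case in the quotation where there is no variation at the higher level. The mixed example gives approximately 0.885.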
Background, data and theory
Prior to presenting our methods it is necessary to consider the nature of areal units for which spatial data may be provided. There may be processes and effects within areal data that interact in a complex fashion to create the observed data. If data are available at different scales, this may reflect the processes generating the data. However, there may be other processes affecting observed data that occur at scales for which we do not have information. Despite this, they deserve
Methodology
The models and methods described by Tranmer and Steel (2001) allow only a global measure of homogeneity to be calculated; they do not allow the differing levels of homogeneity within a SAR district to be calculated. We therefore extend the approach to examine evidence of such changes in homogeneity by attempting to identify the processes generating these different levels of homogeneity. Having presented some background to the approach, this section details the method that was used to further
Analysis
The Glasgow SAR district was chosen to test the methodology outlined above, as it was known to be an area in which strong scale effects could be seen. It is contrasted with the Reigate and Ribble SAR districts, which were identified as less susceptible to MAUP (scale) effects (Manley & Flowerdew, 2003). Reigate was chosen in part because Tranmer and Steel (2001) used it as an example, and Ribble because it was known to include areas with different settlement patterns. The variables used are
Conclusions
It has been shown that although an aggregation level (EDs or wards in our case) is presented as a homogeneous set of areal units, in reality an aggregation level may be affected by processes operating at vastly different scales. Two variables have been used, demonstrating that different variables behave differently. Thus, the processes that operate for certain units are specific to a certain variable. It is clear that it is not possible to define an ideal single census geography
Acknowledgements
The census data used in this study, including the Household Sample of Anonymised Records, are Crown Copyright. They were bought for academic use by the ESRC/JISC/DENI and are held at the Manchester Computing Centre. Digital boundary data for Great Britain were also purchased by ESRC for the academic community. Access was obtained via the UKBORDERS service at the University of Edinburgh. An initial version of this paper was presented at the GISRUK 2003 conference at City University. The authors
References
Local indicators of spatial association—LISA. Geographical Analysis (1995).
Cliff & Ord. Spatial autocorrelation (1973).
Census geography.
Behaviour of regression models under random aggregation.
Fotheringham & Wong. The modifiable areal unit problem in multivariate statistical analysis. Environment and Planning A (1991).
Gehlke & Biehl. Certain effects of grouping upon the size of the correlation in census tract material. Journal of the American Statistical Association (1934).
Local spatial statistics: an overview.
New evidence on the modifiable areal unit problem.
Multilevel statistical models (2003).
Goodchild. Spatial autocorrelation (1986).
Spatial data analysis: Theory and practice.