Scales, levels and processes: Studying spatial patterns of British census variables

doi:10.1016/j.compenvurbsys.2005.08.005

Computers, Environment and Urban Systems

Volume 30, Issue 2, March 2006, Pages 143-160

https://doi.org/10.1016/j.compenvurbsys.2005.08.005

Abstract

This paper is based on the assumption that there may be scale effects at all levels of areal data and that they vary both within and between areal units. Spatial distributions are based on processes taking place in geographical space. A mapped pattern may reflect several distinct processes, each of which may affect a different area and operate at a different scale. The challenge for the spatial analyst is to identify these processes and evaluate their importance from the spatial pattern observed. Here the well-known modifiable areal unit problem is not really a problem but a resource: data at different scales can help us identify processes operating at different scales. We build on models and methods described by [Tranmer, M., & Steel, D. G. (2001). Using local census data to investigate scale effects. In N. J. Tate, & P. M. Atkinson (Eds.), Modelling scale in geographical information science (pp. 105–122). Chichester: John Wiley and Sons], which facilitate the identification of processes occurring within areal units. The method is extended using concepts from multi-level modelling and spatial autocorrelation, through local statistics applied to what may be termed area effect estimates. It is illustrated with respect to two very different census variables and three different study areas.

Introduction

The modifiable areal unit problem (MAUP) is a phenomenon whereby different results are obtained in analysis of the same data grouped into different sets of areal units. It vexes the geographical and spatial analyst almost as much today as it did when first identified by Gehlke and Biehl (1934) or when subsequently popularised by Openshaw and Taylor (1979, 1981). The MAUP has been subdivided into two separate but linked issues. One is the zonation issue, which concerns the effects of the arbitrary nature of the boundary division placed upon the data. The other is the scale issue, which arises where the statistical results of an analysis change as the level of analysis changes. These effects occur because the spatial processes generating the observed data may operate at scales, and over particular areas, that the boundaries in use reflect more or less accurately. Among other authors, Fotheringham and Wong (1991) have demonstrated these effects for US census data, and Tranmer and Steel (2001) have done so for UK data. See Openshaw (1984) for further discussion of these concepts.
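The scale issue can be illustrated with a small synthetic example. The sketch below is not taken from the paper: the data, the offsets and the helper names (`pearson`, `area_means`) are all assumptions made for illustration. Each of six hypothetical areas holds four individuals whose two variables share an area-level component but are unrelated within the area, so the correlation is weak for individuals and very strong for area means.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def area_means(records, size):
    """Average consecutive blocks of `size` (x, y) records into one pair per area."""
    means = []
    for start in range(0, len(records), size):
        block = records[start:start + size]
        means.append((sum(x for x, _ in block) / size,
                      sum(y for _, y in block) / size))
    return means


# Hypothetical within-area deviations: zero mean and mutually uncorrelated.
X_OFFSETS = [-3, -3, 3, 3]
Y_OFFSETS = [-3, 3, -3, 3]

# Six illustrative areas; the area index acts as a shared area-level effect.
individuals = [(area + dx, area + dy)
               for area in range(6)
               for dx, dy in zip(X_OFFSETS, Y_OFFSETS)]

r_individual = pearson([x for x, _ in individuals], [y for _, y in individuals])
aggregated = area_means(individuals, size=4)
r_area = pearson([x for x, _ in aggregated], [y for _, y in aggregated])

print(f"individual-level r = {r_individual:.3f}")
print(f"area-level r       = {r_area:.3f}")
```

With these made-up values the individual-level correlation is about 0.24 while the area-level correlation is 1.0, because aggregation averages away the within-area variation and leaves only the shared area component.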

Two analytical techniques are applied in this paper to investigate the processes generating spatial patterns. The first technique is the multi-level model, or MLM (Jones, 1991). The MLM is based on the recognition that a response variable can be affected by processes occurring at both the individual level and the group level. Thus, the MLM can be used to assess the existence, and estimate the magnitude, of processes that operate at the individual person level, and also at one or more group levels. In the classic applications of MLM in education, the groups may correspond to classes or schools; in the current context, the groups may refer to geographical areas over which spatial processes operate.

The second of these techniques is spatial autocorrelation. This has been identified as highly relevant to the analysis of spatial data, such as data that is available for areal units (see for instance Cliff & Ord, 1973). Spatial autocorrelation has been discussed as a factor in the debate concerning the modifiable areal unit problem (see Openshaw & Taylor, 1979). At its simplest, spatial autocorrelation can be thought of as the correlation of a variable at one place with the same variable at neighbouring places. It exemplifies Tobler’s first law of geography that “everything is related to everything else, but near things are more related than distant things” (Tobler, 1970, p. 236). Goodchild (1986) gives a more detailed treatment.
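As a concrete illustration of this definition, the sketch below computes global Moran's I, one widely used spatial autocorrelation coefficient. The passage above does not say which coefficient the authors use, so the choice of Moran's I, the binary chain-shaped neighbour structure and the function names (`morans_i`, `chain_weights`) are assumptions made for illustration only.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_total = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between every pair of places.
    cross = sum(weights[i][j] * deviations[i] * deviations[j]
                for i in range(n) for j in range(n))
    sum_squares = sum(d * d for d in deviations)
    return (n / weight_total) * (cross / sum_squares)


def chain_weights(n):
    """Binary weights for n areas in a row; only adjacent areas are neighbours."""
    weights = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        weights[i][i + 1] = 1
        weights[i + 1][i] = 1
    return weights


# Two hypothetical six-area patterns with the same values in different positions.
clustered = [1, 1, 1, 5, 5, 5]
alternating = [1, 5, 1, 5, 1, 5]
w = chain_weights(6)

i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)
print(f"clustered:   I = {i_clustered:.2f}")
print(f"alternating: I = {i_alternating:.2f}")
```

The clustered pattern gives I = 0.6 (similar values sit next to each other) and the alternating pattern gives I = -1.0 (neighbours are as dissimilar as possible), even though both patterns contain exactly the same set of values.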

Spatial autocorrelation can inform analysts about the patterning of areal data. It is logical that spatial autocorrelation and multi-level modelling should be analysed together. Jones (1991, p. 8) states, “the degree of auto-correlation in MLM can loosely be conceived as the ratio of ‘variation at the higher level’ to the ‘total variation at all levels’. A value of zero for a spatial autocorrelation coefficient signifies no auto-correlation, indicating that there is no variation at the higher level”. The work presented here builds on this basis, aiming to find evidence for the spatial processes generating the data under analysis, using a combination of adapted multi-level modelling and spatial autocorrelation techniques. The paper also provides conclusions about the patterns displayed by certain British census variables.
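The ratio described by Jones can be sketched numerically. The example below is a simplification and not the authors' method: instead of fitting a multi-level model it uses the one-way ANOVA moment estimator for balanced groups, and the function names and data are hypothetical. It returns the share of total variation attributable to the higher (area) level.

```python
def variance_components(groups):
    """Moment estimates of (between-area, within-area) variance for balanced groups."""
    k = len(groups)
    n = len(groups[0])
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least two equal-sized groups of two or more values")
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    group_means = [sum(g) / n for g in groups]
    # Mean squares between and within groups, as in a one-way ANOVA.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means)
                    for x in g) / (k * (n - 1))
    # Negative estimates are truncated to zero.
    between = max((ms_between - ms_within) / n, 0.0)
    return between, ms_within


def higher_level_share(groups):
    """Ratio of higher-level variation to total variation at both levels."""
    between, within = variance_components(groups)
    return between / (between + within)


# Hypothetical data: three areas with three individuals each.
distinct_areas = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
identical_areas = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]

share_distinct = higher_level_share(distinct_areas)
share_identical = higher_level_share(identical_areas)
print(f"distinct areas:  {share_distinct:.3f}")
print(f"identical areas: {share_identical:.3f}")
```

When the areas differ strongly the ratio is about 0.99, and when every area has the same composition it is 0.0, which corresponds to the case of no variation at the higher level in the quotation above.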

Section snippets

Background, data and theory

Prior to presenting our methods it is necessary to consider the nature of areal units for which spatial data may be provided. There may be processes and effects within areal data that interact in a complex fashion to create the observed data. If data are available at different scales, this may reflect the processes generating the data. However, there may be other processes affecting observed data that occur at scales for which we do not have information. Despite this, they deserve

Methodology

The models and methods described by Tranmer and Steel (2001) only allow for a global measure of homogeneity to be calculated, but do not allow the differing levels of homogeneity within a SAR district to be calculated. Therefore we extend the approach to examine evidence of such changes in homogeneity by attempting to identify processes generating these different levels of homogeneity. Having presented some background to the approach, this section details the method that was used to further

Analysis

The Glasgow SAR district was chosen to test the methodology outlined above, as it was known to be an area in which strong scale effects could be seen. It will be contrasted with the Reigate and Ribble SAR districts, which were identified as less susceptible to MAUP (scale) effects (Manley & Flowerdew, 2003). Reigate was chosen in part because Tranmer and Steel (2001) used it as an example, and Ribble because it was known to include areas of different settlement pattern. The variables used are

Conclusions

It has been shown that although an aggregation level (EDs or wards in our case) is presented as a homogeneous set of areal units, the reality is that an aggregation level may be affected by processes operating at vastly different scales. Two variables have been used, demonstrating that different variables act in different manners. Thus, the processes that operate for certain units are specific to a certain variable. It is clear that it is not possible to define an ideal single census geography

Acknowledgements

The census data used in this study, including the Household Sample of Anonymised Records, are Crown Copyright. They were bought for academic use by the ESRC/JISC/DENI and are held at the Manchester Computing Centre. Digital boundary data for Great Britain were also purchased by ESRC for the academic community. Access was obtained via the UKBORDERS service at the University of Edinburgh. An initial version of this paper was presented at the GISRUK 2003 conference at City University. The authors

References (23)

L. Anselin
Local indicators of spatial association—LISA
Geographical Analysis
(1995)
A. Cliff et al.
Spatial autocorrelation
(1973)
C. Denham
Census geography
R. Flowerdew et al.
Behaviour of regression models under random aggregation
A.S. Fotheringham et al.
The modifiable areal unit problem in multivariate statistical analysis
Environment and Planning A
(1991)
C.E. Gehlke et al.
Certain effects of grouping upon the size of the correlation in census tract material
Journal of the American Statistical Association
(1934)
A. Getis et al.
Local spatial statistics: an overview
M. Green et al.
New evidence on the modifiable areal unit problem
H. Goldstein
Multilevel statistical models
(2003)
M.F. Goodchild
Spatial autocorrelation
(1986)
R. Haining
Spatial data analysis: Theory and practice
(2003)

