Elsevier

Social Networks

Volume 34, Issue 1, January 2012, Pages 6-17

Networks and geography: Modelling community network structures as the outcome of both spatial and network processes

https://doi.org/10.1016/j.socnet.2010.12.001

Abstract

This paper focuses on how to extend the exponential random graph models to take into account the geographical embeddedness of individuals in modelling social networks. We develop a hierarchical set of nested models for spatially embedded social networks, in which, following Butts (2002), an interaction function between tie probability and Euclidean distance between nodes is introduced. The models are illustrated by an empirical example from a study of the role of social networks in understanding spatial clustering in unemployment in Australia. The analysis suggests that a spatial effect cannot solely explain the emergence of organised network structure and it is necessary to include both spatial and endogenous network effects in the model.

Introduction

It is not a new idea that physical distance limits people's capacity to form and maintain relationships. Empirical research on the associations between social and geographic space has occurred in disconnected scientific communities, including human geography, tourism and regional science, since at least the 1930s (see summary in Butts, 2011, Carrothers, 1956, Mok et al., 2007). Physical propinquity effects have been demonstrated to occur for different types of relationship and at multiple levels of analysis (Festinger et al., 1950, Merton, 1948, Caplow and Forman, 1950, Blake et al., 1956, Whyte, 1957, Sommer, 1969, Wellman, 1996, Mok et al., 2007, Faust et al., 1999, Axhausen, 2006, Larsen et al., 2006). Furthermore, this relationship appears remarkably robust to advances in technology (internet, phones), transportation (highways, freeways, etc.), and cultural differences (Latane et al., 1995, Carley and Wendt, 1991). Over the last 50 years researchers have consistently argued that human relationships are predominantly “local” and that the probability of a tie diminishes as the distance between actors increases, often following either a power law or an exponential decay function (Brown and Moore, 1970, Freeman and Sunshine, 1976, Irwin and Hughes, 1992, Morrill, 1963, Kleinberg, 2000, Wong et al., 2005, Butts, 2002, Butts and Carley, 2000, Butts et al., 2007). The most recent study (Preciado et al., in press) provides empirical evidence that the log odds of a friendship tie between adolescents decrease smoothly as the logarithm of their distance increases. The authors have also shown that the strength of distance dependence is negatively related to age and shared meeting places. A number of empirical studies of the features of large-scale, spatially embedded human networks have also been conducted (Bernard et al., 1988, Butts, 2002, Butts and Carley, 2000, Dodds et al., 2003, Killworth and Bernard, 1978, Korte and Milgram, 1970, Liben-Nowell et al., 2005, Milgram, 1967, Travers and Milgram, 1969).

While past research has borne out that the geographical arrangements of individuals have powerful structuring effects on social relationships and social interactions, there have been relatively few attempts to build explicitly spatial models of social networks and to use these models to understand the way in which social networks are embedded geographically. Models proposed by Butts (2002) and Wong et al. (2005) are notable exceptions. They aim to understand how different geographical arrangements of connected actors give rise to a particular social network structure and so, for example, whether geographical proximity between individuals can explain some of the commonly observed properties of social networks, such as clustering, skewed degree distributions, and short average geodesic distances.

Butts (2002) utilised a family of spatial non-directed inhomogeneous Bernoulli graphs to study the empirical relationship between geographical distance and network tie probability. He proposed a model that assumes that ties between individuals are independent of one another conditional on an observed distance structure (Butts, 2002):

$$\Pr(X = x \mid D = d, \varphi_d) = \prod_{i,j} B\big(x_{ij} \mid \varphi(d_{ij})\big),$$

where $X$ is an adjacency matrix of network tie variables with $x_{ij} = 1$ if there is a tie between $i$ and $j$, and $x_{ij} = 0$ otherwise; $D$ is a distance matrix with $d_{ij}$ the geographical distance between $i$ and $j$; $\varphi_d = \varphi(d)$ is a distance interaction function, which is defined as a function mapping distances defined on $(0, \infty)$ onto tie probabilities in $[0, 1]$; $x$ and $d$ refer to realisations of $X$ and $D$, respectively; and $B$ is the Bernoulli probability mass function given by

$$B(x, \varphi_d) = \begin{cases} p_d & \text{if } x = 1, \\ 1 - p_d & \text{if } x = 0, \end{cases}$$

where $p_d$ is the tie probability that $\varphi_d$ assigns to distance $d$. Here the substantial steps are to identify a choice of distance measure, $D$, as a function of locations in a physical space, and to specify the distance interaction function, $\varphi_d$, that relates distance to tie probability. Given a distance matrix $D$ and a distance interaction function $\varphi_d$, the spatial Bernoulli graph can be constructed as the outcome of a series of independent Bernoulli trials in which the probability of each edge between actors $i$ and $j$ is determined by the distance between them and the distance interaction function. Butts (2002) argued that this model captures some of the commonly observed structural characteristics of the network, such as a high degree of transitivity and the formation of locally dense clusters.
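The construction described above, a series of independent Bernoulli trials with tie probabilities set by a distance interaction function, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the power-law form `power_law_phi` and its parameters `p_b`, `alpha` and `gamma` are hypothetical choices, not the function estimated in the paper, and the product is taken over unordered pairs i < j because the graph is non-directed.

```python
import math
import random


def euclidean_distances(coords):
    """Return the symmetric matrix d[i][j] of Euclidean distances between points."""
    n = len(coords)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = math.dist(coords[i], coords[j])
    return d


def power_law_phi(dist, p_b=0.9, alpha=1.0, gamma=2.0):
    """Hypothetical distance interaction function: p_b / (1 + alpha * d) ** gamma."""
    return p_b / (1.0 + alpha * dist) ** gamma


def spatial_bernoulli_graph(d, phi, rng):
    """Draw each non-directed tie x_ij independently with probability phi(d_ij)."""
    n = len(d)
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            tie = 1 if rng.random() < phi(d[i][j]) else 0
            x[i][j] = x[j][i] = tie
    return x


def log_likelihood(x, d, phi):
    """Sum of log B(x_ij | phi(d_ij)) over unordered pairs i < j."""
    total = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            p = phi(d[i][j])
            total += math.log(p if x[i][j] == 1 else 1.0 - p)
    return total


# Illustrative data: 30 actors placed uniformly at random in the unit square.
rng = random.Random(0)
coords = [(rng.random(), rng.random()) for _ in range(30)]
d = euclidean_distances(coords)
x = spatial_bernoulli_graph(d, power_law_phi, rng)
print(sum(map(sum, x)) // 2, "ties; log-likelihood:", round(log_likelihood(x, d, power_law_phi), 2))
```

Because every tie is drawn independently given the distances, the log-likelihood is a plain sum over pairs, which is what makes alternative forms of the distance interaction function easy to compare under this model.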

Butts emphasised that spatial locations can be considered not only in terms of the physical locations of actors (for example, residential location) but also in terms of social positions, for example, positions in Blau space, where the social locations of individuals are represented in a socio-demographic coordinate system. The focus of this paper, however, is on the physical locations of individuals, represented by “physical distance” or “geographical proximity”.

Wong et al. (2005) also developed a model for networks in which the edge probability between any two nodes is considered to be dependent on the spatial distance between those nodes, and demonstrated that this model is also able to capture many commonly observed properties of social networks. They represented social networks through an extension of non-directed Erdős-Rényi random graphs by proposing a step-function relationship between edge probability and spatial distance, i.e.:

$$\Pr(x_{ij} \mid d_{ij}) = \begin{cases} p + p_b & \text{if } d_{ij} \le R, \\ p - \alpha & \text{if } d_{ij} > R, \end{cases}$$

where $x_{ij}$ is a network tie; $d_{ij}$ is the Euclidean distance between actors $i$ and $j$; $p$ is the density (i.e. average probability) of the network; $p_b$ is the proximity bias, which specifies the sensitivity to geographical proximity; $R$ is the neighbourhood radius within which the proximity bias applies; and $\alpha$ is a correction term that is required to ensure that the average density remains the same, given a distance matrix, $d$, for all possible $R$ and $p_b$. It is assumed that the $X_{ij}$ are independent and identically distributed Bernoulli random variables conditional on distance being $\le R$ or $> R$, and hence that the proximity bias, $p_b$, is the same for all actors.
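The role of the correction term can be illustrated with a short sketch. It reads the step function as assigning probability p + p_b to pairs within radius R and p - α to pairs beyond it, and it derives α from the stated requirement that the average tie probability stays at p: if f is the share of pairs with distance at most R, then f(p + p_b) + (1 - f)(p - α) = p gives α = f p_b / (1 - f). This derivation is an assumption made for illustration; the exact expression used by Wong et al. (2005) is not reproduced in the text above, and the function names are hypothetical.

```python
def near_fraction(d, R):
    """Share f of unordered pairs (i < j) whose distance is at most R."""
    n = len(d)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(1 for i, j in pairs if d[i][j] <= R) / len(pairs)


def correction_term(d, p_b, R):
    """Solve f*(p + p_b) + (1 - f)*(p - alpha) = p for alpha (assumed derivation)."""
    f = near_fraction(d, R)
    if f == 1.0:
        raise ValueError("every pair lies within R, so no far pairs can absorb the correction")
    return f * p_b / (1.0 - f)


def tie_probability(d_ij, p, p_b, R, alpha):
    """Step function: p + p_b within the radius, p - alpha beyond it."""
    return p + p_b if d_ij <= R else p - alpha


def mean_tie_probability(d, p, p_b, R):
    """Average tie probability over all unordered pairs; should equal p."""
    alpha = correction_term(d, p_b, R)
    n = len(d)
    probs = [tie_probability(d[i][j], p, p_b, R, alpha)
             for i in range(n) for j in range(i + 1, n)]
    return sum(probs) / len(probs)


# Illustrative data: four actors spaced one unit apart on a line.
positions = [0.0, 1.0, 2.0, 3.0]
d = [[abs(a - b) for b in positions] for a in positions]
alpha = correction_term(d, p_b=0.2, R=1.0)
print(alpha, mean_tie_probability(d, p=0.3, p_b=0.2, R=1.0))
```

The sketch does not check that p + p_b and p - α remain valid probabilities; a fuller version would need to restrict R and p_b accordingly.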

These models demonstrate how the potential importance of geographical proximity to tie formation processes can be parameterised in models for social networks. The main advantage of these models is that they enable us to explore the parametric function relating social interactions to physical distance and can demonstrate simple regularities in a social network that may be associated with the spatial proximity of actors. The models propose different functions. In Butts’ model, tie probabilities vary according to distance via a continuous spatial interaction function, whereas in Wong et al.’s model, there is a simple threshold, or step-function, relating tie probability and distance, with the proximity bias in operation only when distance is below the threshold. In spite of this clear difference, the two models are similar in one important respect: they do not incorporate complex dependences between edges, and they explain the emergence of social network structure solely in terms of spatial proximities among actors. Yet it may be unrealistic to assume that network ties are conditionally independent given spatial proximity. Whether there is a network tie between two actors, say Ben and Sarah, may depend not only on the geographical proximity of Ben and Sarah but also on whether they have network partners in common. While Butts (2002) and Wong et al. (2005) primarily focus on cases in which edges are independent given the distance, they both recognise that endogenous social processes may account for some aspects of emergent social structure, so that spatial proximity may be both a by-product and a determinant of social structure. They also indicate the necessity of constructing a nested family of exponential random graph models that can be used to evaluate the empirical evidence for spatial and network dependences, and demonstrate that both models can be easily translated into the more general exponential family framework with some minor modification. It is worth noting that (to our knowledge) there have been no applications to date of this promising approach.
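The Ben and Sarah example can be made concrete with a small sketch of a conditional tie probability that depends on both distance and shared partners. The specification is hypothetical: it assumes the conditional log-odds of a tie are linear in an edge term, the logarithm of distance, and the number of shared partners (the change in the triangle count when the tie is added). The parameter names and values are invented for illustration and are not the statistics or estimates used in the paper.

```python
import math


def shared_partners(x, i, j):
    """Number of actors k tied to both i and j in adjacency matrix x."""
    return sum(1 for k in range(len(x)) if k != i and k != j and x[i][k] and x[k][j])


def conditional_tie_probability(x, d, i, j, theta_edge, theta_dist, theta_shared):
    """Pr(x_ij = 1 | rest of the graph, distances) under the assumed logit form."""
    logit = (theta_edge
             + theta_dist * math.log(d[i][j])
             + theta_shared * shared_partners(x, i, j))
    return 1.0 / (1.0 + math.exp(-logit))


# Actors: 0 = Ben, 1 = Sarah, 2 = a hypothetical third actor.
d = [[0.0, 2.0, 1.0],
     [2.0, 0.0, 1.5],
     [1.0, 1.5, 0.0]]
no_common = [[0, 0, 0],
             [0, 0, 0],
             [0, 0, 0]]
one_common = [[0, 0, 1],
              [0, 0, 1],
              [1, 1, 0]]

# Hypothetical parameter values chosen only to show the direction of the effects.
theta = dict(theta_edge=-1.0, theta_dist=-1.0, theta_shared=1.5)
p_without = conditional_tie_probability(no_common, d, 0, 1, **theta)
p_with = conditional_tie_probability(one_common, d, 0, 1, **theta)
print(round(p_without, 3), round(p_with, 3))
```

With `theta_shared` set to zero the probability depends on distance alone, which corresponds to the conditionally independent case; a positive value lets a common partner raise the chance of a tie at the same distance.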

The first objective of this paper is therefore to formulate models that allow simultaneous estimation of potential spatial and network effects involved in tie formation. We do so by building empirically testable models for network structure that accommodate geographical proximity as well as endogenous network processes in explanations of network structure. In the process of model construction, we utilise exponential random graph models as well as Butts’ (2011) framework for specifying the distance interaction function, a function that describes the relationship between tie probability and the distance between actors. Our second objective is to apply these models to network data gathered using a snowball sampling methodology in a suburban community within Melbourne, Australia, and hence to assess, at the level of a large community, the potentially distinctive roles that spatial proximity and network processes may play in shaping social network structure.

Exponential random graph models

As noted earlier, a random graph or network is represented by a binary matrix X = [Xij] of network tie variables on a node set N. Each possible edge or tie in the network is regarded as a random variable, with xij = 1 if there is an edge from node i to node j, and xij = 0 otherwise. Here, we regard the node set as fixed and possible ties as nondirected (Xij = Xji) and disallow self-ties of the form Xii. The matrix of all network variables is denoted by X, while x = [xij] refers to a realisation of X.

Data

The data used in this example are derived from a survey designed to assess the role of networks and geographical proximity in explaining the distribution of unemployment in a suburban Australian region (see Daraganova, 2008, for details). The study was quantitative and used an interviewer-administered survey, with participants recruited via a 2-wave snowball sampling scheme (e.g., Frank, 2005, Frank and Snijders, 1994, Goodman, 1961, Handcock and Gile, 2010). Specifically, a

Analysis and results

The results are presented in two parts. The first part describes the analysis of different forms of the distance interaction function. The second part presents the results of fitting the exponential random graph models with geographical proximity to the empirical data.

Conclusions

The main goal of this study was to formulate models that allow simultaneous estimation of spatial and network effects from observed network data, and, then, to assess these effects simultaneously. The assessment was made using a study designed to assess community network structure as a function of spatial locations of individuals in a suburban Australian region. As a method we utilised the exponential random graph model approach as it allows the relaxation of the assumption of independence

Acknowledgements

We thank Carter Butts for his helpful suggestions and discussion. We would also like to thank Peng Wang for the programming of the simulation and estimation programs.

References (59)

  • C.T. Butts. Predictability of large-scale spatially embedded networks.
  • C.T. Butts (2000). Spatial Models of Large-Scale Interpersonal Networks, PhD, Carnegie Mellon...
  • C.T. Butts (2011). Space and Structure: Methods and Models for Large-Scale Interpersonal Networks.
  • C.T. Butts, K. Carley (2000). Spatial Models of Large-Scale Interpersonal Networks. Unpublished...
  • C.T. Butts et al. (2007). Responder communication networks in the World Trade Center disaster: implications for modeling of communication within emergency settings. The Journal of Mathematical Sociology.
  • T. Caplow et al. (1950). Neighbourhood interaction in a homogeneous community. American Sociological Review.
  • K. Carley et al. (1991). Electronic mail and scientific communication. Science Communication.
  • G.A.P. Carrothers (1956). An historical review of the gravity and potential concepts of human interaction. Journal of the American Institute of Planners.
  • G. Daraganova (2008). Statistical models for social networks and network-mediated social influence processes. Thesis...
  • P.S. Dodds et al. (2003). An experimental study of search in global social networks. Science.
  • P. Erdős et al. (1959). On random graphs I. Publicationes Mathematicae (Debrecen).
  • K. Faust et al. (1999). Spatial arrangements of social and economic networks among villages in Nang Rong district, Thailand. Social Networks.
  • L. Festinger et al. (1950). Social Pressures in Informal Groups.
  • O. Frank. Network sampling and model fitting.
  • O. Frank et al. (1986). Markov graphs. Journal of the American Statistical Association.
  • O. Frank et al. (1994). Estimating the size of hidden populations using snowball sampling. Journal of Official Statistics.
  • L.C. Freeman et al. (1976). Race and intra-urban migration. Demography.
  • L.A. Goodman (1961). Snowball sampling. The Annals of Mathematical Statistics.