Constrained variants of the gravity model and spatial dependence: model specification and estimation issues

  • Original Article
Journal of Geographical Systems

Abstract

In this paper, we distinguish three constrained variants of the gravity model of spatial interaction: doubly constrained, production constrained and attraction constrained exponential gravity models. These model variants include origin- and/or destination-specific balancing factors that act as constraints to ensure that the estimated rows and columns of the flow data matrix sum to the observed row and column totals. Because flows are typically counts, the Poisson rather than the normal probability model specification furnishes the appropriate statistical distribution, and parameter estimation can be achieved via Poisson regression. This probability model specification motivates the use of origin and/or destination fixed effects or—under certain conditions—the use of origin- and/or destination-specific random effects for model estimation. The paper establishes theoretical connections between balancing factors, fixed effects represented by binary indicator variables and random effects. The results pertaining to both the doubly and singly constrained cases of spatial interaction are illustrated with an empirical example while accounting for spatial dependence between flows from locations neighbouring both the origins and destinations during estimation.
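The role of the balancing factors in the doubly constrained case can be illustrated with a minimal sketch. It assumes Wilson's (1967) multiplicative form, \( \mu_{ij} = A_{i} O_{i} B_{j} D_{j} \exp ( - \theta d_{ij} ) \), in which the observed origin totals \( O_{i} \) and destination totals \( D_{j} \) serve as mass terms. The function name, the toy totals, the distances and the value of \( \theta \) are illustrative assumptions, not the data or the estimation code used in the paper.

```python
import math


def balance_factors(origin_totals, dest_totals, dist, theta,
                    tol=1e-12, max_iter=10_000):
    """Alternately update A_i and B_j until both sets stabilise.

    Returns (A, B, flows) where
    flows[i][j] = A[i] * O[i] * B[j] * D[j] * exp(-theta * dist[i][j]).
    """
    n, m = len(origin_totals), len(dest_totals)
    # Exponential separation function f(d_ij) = exp(-theta * d_ij).
    decay = [[math.exp(-theta * dist[i][j]) for j in range(m)]
             for i in range(n)]
    A, B = [1.0] * n, [1.0] * m
    for _ in range(max_iter):
        # A_i = 1 / sum_j B_j D_j f(d_ij)
        new_A = [1.0 / sum(B[j] * dest_totals[j] * decay[i][j]
                           for j in range(m))
                 for i in range(n)]
        # B_j = 1 / sum_i A_i O_i f(d_ij), using the updated A_i
        new_B = [1.0 / sum(new_A[i] * origin_totals[i] * decay[i][j]
                           for i in range(n))
                 for j in range(m)]
        # Largest relative change across both sets of factors.
        change = max(abs(new - old) / old
                     for new, old in zip(new_A + new_B, A + B))
        A, B = new_A, new_B
        if change < tol:
            break
    flows = [[A[i] * origin_totals[i] * B[j] * dest_totals[j] * decay[i][j]
              for j in range(m)]
             for i in range(n)]
    return A, B, flows


# Toy system (assumed values); the two sets of totals must have the same sum.
O = [40.0, 60.0, 100.0]
D = [50.0, 70.0, 80.0]
d = [[1.0, 3.0, 5.0],
     [3.0, 1.0, 2.0],
     [5.0, 2.0, 1.0]]

A, B, flows = balance_factors(O, D, d, theta=0.5)
row_sums = [sum(row) for row in flows]
col_sums = [sum(flows[i][j] for i in range(len(O))) for j in range(len(D))]
print(row_sums, col_sums)
```

In this sketch the predicted row and column sums reproduce the supplied totals up to the convergence tolerance. The factors themselves are determined only up to a common scale, since multiplying every \( A_{i} \) by a constant and dividing every \( B_{j} \) by the same constant leaves the predicted flows unchanged.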

Notes

  1. The terms gravity model and spatial interaction model are often used interchangeably. But they are not the same. Spatial interaction models include not only gravity models, but also similar models that have been derived using powerful methods of entropy maximisation from statistical mechanics (Wilson 1967), or utility maximisation from economic theory (Niedercorn and Bechdolt 1969), and those based on intervening opportunities which can be derived heuristically.

  2. For a discussion of problems that plague empirical implementation of regression-based gravity models, and econometric extensions that have recently appeared in the literature, see LeSage and Fischer (2010). These new models replace the conventional assumption of independence among origin–destination flows with formal approaches that allow for spatial dependence in flow magnitudes. The econometric extensions are based on the assumption of a linear relation between the dependent and the independent variables, and this assumes the dependent variable to be normally distributed.

  3. Trip making is viewed as consisting of four components (see, for example, Fischer 2000): trip generation and attraction (the decision to make a trip and how often); trip distribution in a system of traffic zones; modal split (choice of mode of transport); and trip assignment (choice of route through network). The gravity model is used for trip distribution but is preceded by trip generation and attraction models that provide independent estimates of locational (zonal) trip origins and attractions that subsequently become the “mass” terms of the gravity model. Thus, the definition of the row and column sums of the predicted trip matrix coincides exactly with the definitions of the respective mass terms.

  4. Spatial dependence is also known as network autocorrelation (see Black 1992; Chun 2008; Griffith 2009; Chun and Griffith 2011), although the two terms differ in much the same way as spatial dependence and spatial autocorrelation differ in general.

  5. An alternative formulation to that given in Eq. (1) is \( Y_{ij} = K_{ij} U_{i} V_{j}\;f(d_{ij} )\eta_{ij} + \varepsilon_{ij} \) where \( \varepsilon_{ij} \) reflects the sample error and \( \eta_{ij} \) the specification error. In this case, the stochastic nature of \( Y_{ij} \) derives from assumptions made about the stochastic nature of \( \varepsilon_{ij} \) and \( \eta_{ij} \).

  6. The multiplicative form of the balancing factors \( A_{i} \) and \( B_{j} \) (Wilson 1967) ensures mathematical tractability in searching for an adequate estimation procedure. Alternatively, Tobler (1983) suggests an additive adjustment scheme, \( K_{ij} = A_{i} + B_{j} \), to enforce the conservation rule satisfactorily. Ledent (1985) introduces a general functional form that subsumes both the multiplicative (Wilson) and the additive (Tobler) variants.

  7. In the origin constrained and the destination constrained models presented here, the constraints to which these models are subject refer to the full set of n origin or n destination locations. But it is possible to develop models that are only constrained over certain subsets of locations. Such models, which are not considered in this paper, may be found in Wilson (1970).

  8. The notion that separation functions in conventional gravity models work to effectively capture spatial dependence in origin–destination flows has long been challenged. Griffith (2007) provides an historical review of the regional science literature about this topic in which he credits Curry (1972) as the first to conceptualise the problem of spatial dependence in flows.

  9. The constrained gravity model variants are intrinsically nonlinear in their parameters, and thus the application of linear methods leads to biased estimates of these parameters.

  10. In the economics literature, it is often called the RAS procedure.

  11. Independence means that the individual flows from origin i to destination j are independent from each other and that origin–destination flows between any pair of locations are independent from flows between any other pair of locations.

  12. Closely related to this assumption are the assumptions that the set of observations for each origin location has a multinomial distribution, say \( \mathcal{M}\mathcal{N}(Y_{i1} ,Y_{i2} , \ldots ,Y_{in} ;Y_{i \bullet } ) \), or that the set of all observations has a multinomial distribution \( \mathcal{M}\mathcal{N}(Y_{11} ,Y_{12} , \ldots ,Y_{nn} ;Y_{ \bullet \bullet } ) \), where \( Y_{i \bullet } \) is the total flow from origin location i, \( Y_{ \bullet \bullet } \) is the overall flow, and n is the number of origin and destination locations. These multinomial distributions can be generated by assuming that the \( Y_{ij} \) (i, j = 1,…,n) are independent Poisson random variables sampled subject to the origin totals \( Y_{i \bullet } \), or the overall total \( Y_{ \bullet \bullet } \), being fixed (Bishop et al. 1975).

  13. One advantage of using origin/destination indicator variables in a Poisson regression specification is that they yield an individual standard error and null hypothesis probability estimate for each value in the two sets of balancing factors, rather than a single aggregate standard error. One disadvantage is the amount of time necessary to estimate a GLM containing 2n − 2 indicator variables.

  14. The logarithmic link function is best thought of as being an exponential conditional mean function.

  15. McCullagh and Nelder (1983) prove that the procedure converges to the maximum likelihood solution. Note that zero-observed flows do not require any special treatment.

  16. The equivalence of maximum likelihood estimation with the Poisson assumption and the entropy maximisation solution for a doubly constrained gravity model with origin- and destination-specific balancing factors is well known (see Wilson and Kirkby 1980, p. 310). In the latter case, parameter estimation of a model such as Eq. (1) is obtained by maximising an objective function subject to sets of constraints on the origin and destination totals in combination with some constraint on a general measure of spatial separation in the system of locations (Baxter 1982).

  17. Whether the random effects model variants are appropriate model specifications in spatial research remains controversial. When the random effects gravity models are implemented, the spatial units of observation should be representative of a larger population, and n should potentially be able to get to infinity (see Elhorst 2010 for more details on this issue).

  18. Origin/destination-specific spatial dependence in the random effects estimates motivated the gravity model set forth in LeSage et al. (2007) that formally incorporates spatially structured random effects in place of the zero mean, normally distributed independent random effects.

  19. This correlation differs from that latent in the geographic distributions of the origin and destination variables that are reflected in the balancing factors. Pace et al. (2011) show that spatial dependence in the explanatory variables decreases the ability of filtering to produce unbiased regression parameter estimates.

  20. In the fixed effects case of the doubly constrained gravity model, for example, this takes the form \( E(Y_{ij}) = \mu_{ij} = U_{i} V_{j} \exp \left[ \alpha + \sum\nolimits_{h = 1}^{n - 1} I_{iho} \beta_{ho} + \sum\nolimits_{k = 1}^{n - 1} I_{jkd} \beta_{kd} - \theta d_{ij} \right] \prod\nolimits_{j \ne i}^{n} E(Y_{ij})^{\rho W_{ij}} \), where \( W_{ij} \) is the (i,j)th element of an N-by-N spatial weight matrix W and \( \rho \) is a scalar parameter that governs the degree of spatial dependence in origin–destination flows. Lambert et al. (2010) set forth a two-step maximum likelihood estimation approach for a spatial autoregressive Poisson model for count data which would need to be extended to the case of flows involving N observations.

  21. This is an especially valuable approach in situations where the flows are count data, because conventional spatial regression models and software tools are less developed for this data type.

  22. We assume that W is similar to a symmetric matrix so that it has real eigenvalues. If W is not symmetric, then \( \tfrac{1}{2}(W + W^{\prime}) \), which is symmetric by construction, may be used.

  23. If intralocational flows are excluded from an analysis, the N-by-N spatial weight matrix reduces to an n(n–1)-by-n(n–1) one, only marginally impacting upon these eigenvectors when n > 100.

  24. Neighbours may be defined using contiguity or measures of spatial proximity such as cardinal distance (for example, in terms of transportation costs) or ordinal distance (for example, the six nearest neighbours). In the illustrative example in Sect. 6, we use a binary contiguity matrix \( W_{n} \) to define W.

  25. The criterion \( I/I_{\max} = 0.5 \) suggests a restriction of the search over eigenvectors with moderate-to-high spatial autocorrelation.

  26. Pace et al. (2011) demonstrate how using iterative eigenvalue routines on sparse matrices such as W can make filtering feasible for data sets involving a million or more observations, and empirically estimate an operation count on the order of \( N^{1.1} \).

  27. For details about the data construction, see Fischer et al. (2006).

  28. \( A_{257} \) and \( B_{257} \) are the arbitrarily selected balancing factors set to one in each case, to avoid perfect multicollinearity with the intercept term, resulting in an expected intercept of zero and an expected slope of one.

  29. The regression equations describe each set of log-balancing factors as a function of the corresponding fixed effects indicator variables. Error terms are not included here.

  30. A deviance statistic exceeding one indicates that overdispersion is present; that is, the Poisson variance is greater than its mean. Although the existence of overdispersion does not affect the unbiased character of the parameter estimates, their standard errors are underestimated, and hence their significance is unrealistically increased.

  31. Because balancing factors are autoregressive specifications [see Eqs. (13) and (14)], they contain marked spatial dependence by construction. The spatial filter descriptions of these balancing factors rely on eigenvectors of the transformed spatial weight matrix \( M_{n} W_{n} M_{n} \), where \( W_{n} \) is the n-by-n binary contiguity matrix and \( M_{n} \) is the n-by-n projection matrix defined by \( M_{n} = I_{n} - \iota_{n} \,\iota^{\prime}_{n} \,n^{ - 1} \). Forty-two candidate eigenvectors (for which \( I/I_{\max} > 0.25 \)) are available for constructing spatial filters portraying positive spatial autocorrelation across the European regions. Of these, subsets have been selected with a stepwise regression procedure for constructing spatial filters describing the two sets of balancing factors. The criteria used for selection were statistically significant coefficients at the ten percent level associated with minimisation of the log-likelihood function, which is standard practice.

  32. Of note is that for n larger than about 100, current computer resources do not allow direct calculation of the eigenvectors of W. In order to reduce computational intensity, we followed Griffith (2009) to construct the spatial filter with a linear combination of Kronecker products of pairs of origin and destination eigenvectors. The result of this adjustment is \( 24^{2} = 576 \) candidate eigenvectors identified as Kronecker products of the 24 eigenvectors with an I > 0.5 extracted from matrix \( (I - \iota \,\iota^{\prime}\,n^{ - 1} )\,W_{n} \,(I - \iota \,\iota^{\prime}\,n^{ - 1} ) \). With 66,049 observations, five covariates and an intercept term, and 576 candidate eigenvectors, the numerical intensity of the problem solution becomes feasible but is still high.
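Notes 25 and 31 screen candidate eigenvectors of \( M_{n} W_{n} M_{n} \) by the level of spatial autocorrelation they portray. As a hedged illustration of the quantity involved, the sketch below computes the standard Moran coefficient, \( I = (n/\iota^{\prime} W \iota )\,(x^{\prime} MWMx)/(x^{\prime} Mx) \), for a map pattern x. The function names and the six-region chain contiguity structure are assumptions made for illustration and do not come from the paper's empirical example.

```python
def moran_coefficient(x, W):
    """Moran coefficient of the values x under spatial weight matrix W."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]            # z = M x (centred values)
    s0 = sum(sum(row) for row in W)      # iota' W iota
    cross = sum(W[i][j] * z[i] * z[j]    # x' M W M x
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in z)


def chain_contiguity(n):
    """Binary contiguity matrix for n regions arranged along a line."""
    W = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        W[i][i + 1] = W[i + 1][i] = 1
    return W


W = chain_contiguity(6)
# A smooth gradient: neighbouring regions hold similar values.
mc_gradient = moran_coefficient([1, 2, 3, 4, 5, 6], W)
# An alternating pattern: neighbouring regions hold opposite values.
mc_alternating = moran_coefficient([1, -1, 1, -1, 1, -1], W)
print(mc_gradient, mc_alternating)
```

The gradient yields a positive coefficient and the alternating pattern a negative one. For an eigenvector of \( M_{n} W_{n} M_{n} \) with a nonzero eigenvalue \( \lambda \), the same formula reduces to \( n\lambda /\iota^{\prime} W \iota \), which is why thresholds on \( I/I_{\max} \) can be applied directly to the ordered eigenvalues.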

References

  • Bailey TC, Gatrell AC (1995) Interactive spatial data analysis. Longman, Harlow

  • Baxter M (1982) Similarities in methods of estimating spatial interaction models. Geogr Anal 14(3):267–272

  • Bishop YMM, Fienberg SE, Holland PW (1975) Discrete multivariate analysis. MIT Press, Cambridge

  • Black WR (1992) Network autocorrelation in transport network and flow systems. Geogr Anal 24(3):207–222

  • Bolduc D, Laferrière R, Santarossa G (1995) Spatial autoregressive error components in travel flow models: an application to aggregate mode choice. In: Anselin L, Florax R (eds) New directions in spatial econometrics. Springer, Berlin, pp 96–108

  • Cesario FJ (1973) A generalized trip distribution model. J Reg Sci 13(2):233–248

  • Cesario FJ (1977) A new interpretation of the “normalizing” or “balancing factors” of gravity type spatial models. J Soc Econ Plann Sci 11(3):131–136

  • Chun Y (2008) Modeling network autocorrelation within migration flows by eigenvector spatial filtering. J Geogr Syst 10(4):317–344

  • Chun Y, Griffith DA (2011) Modeling network autocorrelation in space-time migration flow data: an eigenvector spatial filtering approach. Ann Assoc Am Geogr 101(3):523–536

  • Curry L (1972) A spatial analysis of gravity flows. Reg Stud 6(2):131–137

  • Davies RB, Guy CM (1987) The statistical modelling of flow data when the Poisson assumption is violated. Geogr Anal 19(4):300–314

  • Elhorst JP (2010) Spatial panel data models. In: Fischer MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin, pp 377–407

  • Evans AW (1970) Some properties of trip distribution models. Transp Res 4(1):19–36

  • Fischer MM (2000) Travel demand—theory. In: Polak JB, Heertje A (eds) Analytical transport economics. Edward Elgar, Cheltenham, pp 51–78

  • Fischer MM, Griffith DA (2008) Modeling spatial autocorrelation in spatial interaction data: a comparison of spatial econometric and spatial filtering specifications. J Reg Sci 48(5):969–989

  • Fischer MM, Wang J (2011) Spatial data analysis: models, methods and techniques [Springer Briefs in Regional Science]. Springer, Berlin

  • Fischer MM, Scherngell T, Jansenberger E (2005) Patents, patent citations and the geography of knowledge spillovers in Europe. In: Markowski T (ed) Regional scientists’ tribute to Professor Ryszard Domanski. Polish Academy of Sciences, Committee for Space Economy and Regional Planning, Warsaw, pp 57–75

  • Fischer MM, Scherngell T, Jansenberger E (2006) The geography of knowledge spillovers between high-technology firms in Europe: evidence from a spatial interaction modeling perspective. Geogr Anal 38(3):288–309

  • Fotheringham AS, O’Kelly ME (1989) Spatial interaction models: formulations and applications. Kluwer, Dordrecht

  • Griffith DA (2000) A linear regression solution to the spatial autocorrelation problem. J Geogr Syst 2(2):141–156

  • Griffith DA (2003) Spatial autocorrelation and spatial filtering. Springer, Berlin

  • Griffith DA (2007) Spatial structure and spatial interaction: 25 years later. Rev Reg Stud 37(1):28–38

  • Griffith DA (2009) Modeling spatial autocorrelation in spatial interaction data: empirical evidence from 2002 Germany journey-to-work flows. J Geogr Syst 11(2):117–140

  • Haynes KE, Fotheringham AS (1984) Gravity and spatial interaction models. Sage, Beverly Hills

  • Isard W (1960) Methods of regional analysis. MIT Press, Cambridge

  • Kirby HR (1974) Theoretical requirements for calibrating gravity models. Transp Res 8(1):97–104

  • Lambert DM, Brown JP, Florax RJGM (2010) A two-step estimator for a spatial lag model of counts: theory, small sample performance and an application. Reg Sci Urban Econ 40(4):241–252

  • Ledent J (1985) The doubly constrained model of spatial interaction: a more general formulation. Environ Plann A 17(2):253–262

  • LeSage JP, Fischer MM (2010) Spatial econometric methods for modeling origin-destination flows. In: Fischer MM, Getis A (eds) Handbook of applied spatial analysis. Springer, Berlin, pp 409–433

  • LeSage JP, Pace RK (2008) Spatial econometric modeling of origin-destination flows. J Reg Sci 48(5):941–968

  • LeSage JP, Pace RK (2009) Introduction to spatial econometrics. CRC Press, Boca Raton

  • LeSage JP, Fischer MM, Scherngell T (2007) Knowledge spillovers across Europe. Evidence from a Poisson spatial interaction model with spatial effects. Pap Reg Sci 86(3):393–421

  • McCullagh P, Nelder JA (1983) Generalized linear models. Chapman and Hall, London

  • Niedercorn J, Bechdolt B (1969) An economic derivation of the “Gravity Law” of spatial interaction. J Reg Sci 9(2):273–282

  • Pace RK, LeSage JP, Zhu S (2011) Interpretation and computation of estimates from regression models using spatial filtering. Paper presented at the Fifth World Conference of the Spatial Econometrics Association, Toulouse, July 6–8

  • Sen A, Smith T (1995) Gravity models of spatial interaction behavior. Springer, Berlin

  • Tiefelsdorf M (2003) Misspecification in interaction model distance decay relations: a spatial structure effect. J Geogr Syst 5(1):25–50

  • Tiefelsdorf M, Boots B (1995) The exact distribution of Moran’s I. Environ Plann A 27(6):985–999

  • Tobler W (1983) An alternative formulation for spatial interaction modeling. Environ Plann A 15(5):693–703

  • Wilson AG (1967) A statistical theory of spatial distribution models. Transp Res 1(3):253–269

  • Wilson AG (1971) A family of spatial interaction models and associated developments. Environ Plann 3(1):1–32

  • Wilson AG, Kirkby MJ (1980) Mathematics for geographers and planners, 2nd edn. Clarendon Press, Oxford

Author information

Corresponding author

Correspondence to Manfred M. Fischer.

Appendix: Results for the estimation of singly constrained random effects specifications

Because of the large dimensionality of the calculus problem, multivariate integration struggles to properly estimate the random effects terms. The largest values appear to introduce the greatest difficulties. Figure 6 reveals that integration is completely successful between the minimum and roughly 0.5 in our case study, and only partially successful beyond 0.5. Incorrectly calculated random effects constitute about ten percent of the total number of random effects in this case study.

Fig. 6

Scatterplots of (a) the origin log-balancing factor (vertical axis) versus the Poisson regression origin location random effects (horizontal axis); (b) the destination log-balancing factor (vertical axis) versus the Poisson regression destination location random effects (horizontal axis)

Cite this article

Griffith, D.A., Fischer, M.M. Constrained variants of the gravity model and spatial dependence: model specification and estimation issues. J Geogr Syst 15, 291–317 (2013). https://doi.org/10.1007/s10109-013-0182-7
