Sparse regression for data-driven deterrence functions in gravity models

  • Original Research
  • Annals of Operations Research

Abstract

Gravity models have been among the mathematical models of choice for trip distribution modeling for many decades. Their simplicity offsets their drawbacks, as they usually provide a reasonably good rationale for how goods are distributed in a transportation network with relatively little information. These gravity models, however, rely on the definition of a deterrence function that acts as a counterweight to the levels of supply and demand. This function is usually picked from a set of off-the-shelf functions that depend on only a handful of parameters that need to be calibrated. Because the off-the-shelf options are limited, gravity models sometimes lack flexibility. In this paper, we explore the use of sparse regression techniques that can accommodate data more flexibly with a reduced number of terms. Using interregional freight origin–destination data from Spain, we test two alternatives, namely best subset regression and lasso regression. We show that best subset regression performs better in finding parsimonious deterrence functions, and we attain gravity models that fit the data up to 14.5% better than those using classical deterrence functions.
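To make the idea of a sparse, data-driven deterrence function concrete, the following is a minimal sketch and not the authors' implementation. It assumes the logarithm of the deterrence function can be written as a linear combination of a few candidate terms in the travel cost c, and it picks the best pair of terms by exhaustive search (best subset regression with a fixed subset size). The candidate terms, the synthetic data, and the helper name `best_subset` are all assumptions made for illustration.

```python
import itertools
import numpy as np

def best_subset(X, y, k):
    """Try every k-column subset of X, fit ordinary least squares on each,
    and return (columns, coefficients, rss) for the smallest residual sum of squares."""
    best = None
    for cols in itertools.combinations(range(X.shape[1]), k):
        A = X[:, cols]
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        rss = float(np.sum((y - A @ coef) ** 2))
        if best is None or rss < best[2]:
            best = (cols, coef, rss)
    return best

rng = np.random.default_rng(0)
cost = rng.uniform(0.1, 3.0, size=200)  # synthetic travel costs, arbitrary units

# Hypothetical candidate terms for log f(c); not taken from the paper.
names = ["1", "c", "log c", "c^2", "sqrt c"]
X = np.column_stack(
    [np.ones_like(cost), cost, np.log(cost), cost**2, np.sqrt(cost)]
)

# Synthetic log-deterrence values: log f(c) = -0.8 c - 0.5 log c, plus small noise.
y = -0.8 * cost - 0.5 * np.log(cost) + rng.normal(0.0, 0.01, size=cost.size)

cols, coef, rss = best_subset(X, y, k=2)
selected = [names[j] for j in cols]
print(selected, coef, rss)
```

Exhaustive enumeration is only practical for a small pool of candidate terms, since the number of subsets grows combinatorially. Lasso regression, the second alternative named in the abstract, replaces the hard limit on the number of terms with an L1 penalty on the coefficients.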

Data availability

The data used to perform the numerical experiments in this article are publicly available, as referenced. Moreover, the cleaned data are available as an online resource under the name ESM1.xlsx.

References

  • Bertsimas, D., King, A., & Mazumder, R. (2016). Best subset selection via a modern optimization lens. The Annals of Statistics, 813–852.

  • Besagni, G., & Borgarello, M. (2020). A bottom-up study on the relationships between transportation expenditures and socio-demographic variables: Evidences from the Italian case study. Travel Behaviour and Society, 19, 151–161.

  • Boyd, S. P., & Vandenberghe, L. (2004). Convex optimization. Cambridge University Press.

  • Bradley, R. A., & Srivastava, S. S. (1979). Correlation in polynomial regression. The American Statistician, 33(1), 11–14.

  • Cascetta, E., Marzano, V., & Papola, A. (2008). Multi-regional input-output models for freight demand simulation at a national level. Recent developments in transport modelling. Emerald Group Publishing Limited.

  • Celik, H. M., & Guldmann, J.-M. (2007). Spatial interaction modeling of interregional commodity flows. Socio-Economic Planning Sciences, 41(2), 147–162.

  • Duddu, V. R., & Pulugurtha, S. S. (2013). Principle of demographic gravitation to estimate annual average daily traffic: Comparison of statistical and neural network models. Journal of Transportation Engineering, 139(6), 585–595.

  • Elmi, A. M., Badoe, D. A., & Miller, E. J. (1999). Transferability analysis of worktrip-distribution models. Transportation Research Record, 1676(1), 169–176.

  • Farebrother, R. (1974). Algorithm AS 79: Gram-Schmidt regression. Journal of the Royal Statistical Society Series C (Applied Statistics), 23(3), 470–476.

  • Farrington, P. A., et al. (2011). Methods for forecasting freight in uncertainty: Time series analysis of multiple factors. (Tech. Rep.). University of Alabama.

  • Fischer, M. M. (2002). Learning in neural spatial interaction models: A statistical perspective. Journal of Geographical Systems, 4(3), 287–299.

  • Fischer, M. M., & Leung, Y. (1998). A genetic-algorithms based evolutionary computational neural network for modelling spatial interaction data. The Annals of Regional Science, 32(3), 437–458.

  • Furness, K. P. (1965). Time function iteration. Traffic Engineering and Control, 7(7), 458–460.

  • Gaines, B. R., Kim, J., & Zhou, H. (2018). Algorithms for fitting the constrained lasso. Journal of Computational and Graphical Statistics, 27(4), 861–871.

  • Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical learning with sparsity: The lasso and generalizations. Taylor & Francis.

  • He, Y., Zhao, Y., & Tsui, K. L. (2020). An adapted geographically weighted LASSO (Ada-GWL) model for predicting subway ridership. Transportation, 48, 1–32.

  • Knudsen, D. C., & Fotheringham, A. S. (1986). Matrix comparison, goodness of fit, and spatial interaction modeling. International Regional Science Review, 10(2), 127–147.

  • Kompil, M., & Celik, H. M. (2013). Modelling trip distribution with fuzzy and genetic fuzzy systems. Transportation Planning and Technology, 36(2), 170–200.

  • Lenormand, M., Bassolas, A., & Ramasco, J. J. (2016). Systematic comparison of trip distribution laws and models. Journal of Transport Geography, 51, 158–169.

  • Li, Y., Wang, H., Zhao, J., & Du, B. (2018). Multisource data-driven modeling method for estimation of intercity trip distribution. Mathematical Problems in Engineering.

  • Marquardt, D. W. (1980). Comment: You should standardize the predictor variables in your regression models. Journal of the American Statistical Association, 75(369), 87–91.

  • Martínez, L. M., & Viegas, J. M. (2013). A new approach to modelling distance decay functions for accessibility assessment in transport studies. Journal of Transport Geography, 26, 87–96.

  • Ministerio de Transportes, Movilidad y Agenda Urbana (2021). Publicaciones de la encuesta permanente de transporte de mercancías por carretera. https://www.mitma.gob.es/informacion-para-el-ciudadano/informacion-estadistica/transporte/transporte-de-mercancias-por-carretera/publicaciones-encuesta-permanente-transporte-mercancias-por-carretera/2014/encuesta-permanente-transporte-mercancias-carretera-anos2006. Accessed 24 March 2022.

  • Narula, S. C. (1979). Orthogonal polynomial regression. International Statistical Review/Revue Internationale de Statistique, 31–36.

  • Natarajan, B. K. (1995). Sparse approximate solutions to linear systems. SIAM Journal on Computing, 24(2), 227–234.

  • Nocedal, J., & Wright, S. (2006). Numerical optimization. Springer.

  • Openshaw, S. (1998). Neural network, genetic, and fuzzy logic models of spatial interaction. Environment and Planning A, 30(10), 1857–1872.

  • Openshaw, S., & Connolly, C. (1977). Empirically derived deterrence functions for maximum performance spatial interaction models. Environment and Planning A, 9(9), 1067–1079.

  • Ortúzar, J. D. D., & Willumsen, L. G. (2011). Modelling transport. Wiley.

  • Öztürk, F., & Akdeniz, F. (2000). Ill-conditioning and multicollinearity. Linear Algebra and Its Applications, 321(1–3), 295–305.

  • Rubio-Herrero, J., & Muñuzuri, J. (2021). Indirect estimation of interregional freight flows with a real-valued genetic algorithm. Transportation, 48(1), 257–282.

  • Sbai, A., & Ghadi, F. (2017). Impact of aggregation and deterrence function choice on the parameters of gravity model. In Proceedings of the Mediterranean Symposium on Smart City Applications (pp. 54–66).

  • Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379–423.

  • Shao, H., Lam, W. H., Sumalee, A., & Hazelton, M. L. (2015). Estimation of mean and covariance of stochastic multi-class OD demands from classified traffic counts. Transportation Research Part C: Emerging Technologies, 59, 92–110.

  • Sun, S., Huang, R., & Gao, Y. (2012). Network-scale traffic modeling and forecasting with graphical lasso and neural networks. Journal of Transportation Engineering, 138(11), 1358–1367.

  • Suprayitno, H. (2018). Searching the correct and appropriate deterrence function general formula for calculating gravity trip distribution model. IPTEK The Journal of Engineering, 4(3), 17–25.

  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.

  • Tillema, F., Van Zuilekom, K. M., & Van Maarseveen, M. F. (2006). Comparison of neural networks and gravity models in trip distribution. Computer-Aided Civil and Infrastructure Engineering, 21(2), 104–119.

  • Wang, S., Ji, B., Zhao, J., Liu, W., & Xu, T. (2018). Predicting ship fuel consumption based on lasso regression. Transportation Research Part D: Transport and Environment, 65, 817–824.

  • Wilson, A. (1967). A statistical theory of spatial distribution models. Transportation Research, 1(3), 253–269.

Funding

This research has been funded by the Consejería de Economía, Conocimiento, Empresas y Universidad of Andalusia (project TRACSINT, P20_01183) within Programme FEDER 2014-2020.

Author information

Contributions

JRH: Literature search and review, manuscript writing, statistical tests, and optimization methods. JM: Literature search and review, writing, and analysis of performance of deterrence functions.

Corresponding author

Correspondence to Javier Rubio-Herrero.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A Performance results of best subset regression

Table 4 SRMSE of best subset regression as a function of the number of added terms
Table 5 SRMSE of off-the-shelf deterrence functions
Table 6 Improvement of best subset regression with respect to other deterrence functions (%)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rubio-Herrero, J., Muñuzuri, J. Sparse regression for data-driven deterrence functions in gravity models. Ann Oper Res 323, 153–174 (2023). https://doi.org/10.1007/s10479-023-05227-3

  • DOI: https://doi.org/10.1007/s10479-023-05227-3
