Abstract
Gravity models have been among the mathematical models of choice for trip distribution modeling for many decades. Their simplicity offsets their drawbacks, as they usually provide a reasonably good rationale for how goods are distributed in a transportation network with relatively little information. These gravity models, however, rely on the definition of a deterrence function that acts as a counterweight to the levels of supply and demand. This function is usually picked from a set of off-the-shelf functions that depend on only a handful of parameters that need to be calibrated. Because the off-the-shelf options are limited, gravity models sometimes lack flexibility. In this paper, we explore the use of sparse regression techniques that can accommodate data more flexibly with a reduced number of terms. Using interregional freight origin–destination data from Spain, we test two alternatives, namely best subset regression and lasso regression. We show that the former performs better in finding parsimonious deterrence functions, and we attain gravity models that fit the data up to 14.5% better than those using classical deterrence functions.
Data availability
The data used to perform the numerical experiments in this article are publicly available, as referenced. Moreover, the cleaned data are available as an online resource under the name ESM1.xlsx.
References
Bertsimas, D., King, A., & Mazumder, R. (2016). Best subset selection via a modern optimization lens. The Annals of Statistics, 813–852.
Besagni, G., & Borgarello, M. (2020). A bottom-up study on the relationships between transportation expenditures and socio-demographic variables: Evidences from the Italian case study. Travel Behaviour and Society, 19, 151–161.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge University Press.
Bradley, R. A., & Srivastava, S. S. (1979). Correlation in polynomial regression. The American Statistician, 33(1), 11–14.
Cascetta, E., Marzano, V., & Papola, A. (2008). Multi-regional input-output models for freight demand simulation at a national level. Recent developments in transport modelling. Emerald Group Publishing Limited.
Celik, H. M., & Guldmann, J.-M. (2007). Spatial interaction modeling of interregional commodity flows. Socio-Economic Planning Sciences, 41(2), 147–162.
Duddu, V. R., & Pulugurtha, S. S. (2013). Principle of demographic gravitation to estimate annual average daily traffic: Comparison of statistical and neural network models. Journal of Transportation Engineering, 139(6), 585–595.
Elmi, A. M., Badoe, D. A., & Miller, E. J. (1999). Transferability analysis of worktrip-distribution models. Transportation Research Record, 1676(1), 169–176.
Farebrother, R. (1974). Algorithm AS 79: Gram-Schmidt regression. Journal of the Royal Statistical Society Series C (Applied Statistics), 23(3), 470–476.
Farrington, P. A., et al. (2011). Methods for forecasting freight in uncertainty: Time series analysis of multiple factors. (Tech. Rep.). University of Alabama.
Fischer, M. M. (2002). Learning in neural spatial interaction models: A statistical perspective. Journal of Geographical Systems, 4(3), 287–299.
Fischer, M. M., & Leung, Y. (1998). A genetic-algorithms based evolutionary computational neural network for modelling spatial interaction data. The Annals of Regional Science, 32(3), 437–458.
Furness, K. P. (1965). Time function iteration. Traffic Engineering and Control, 7(7), 458–460.
Gaines, B. R., Kim, J., & Zhou, H. (2018). Algorithms for fitting the constrained lasso. Journal of Computational and Graphical Statistics, 27(4), 861–871.
Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical learning with sparsity: The lasso and generalizations. Taylor & Francis.
He, Y., Zhao, Y., & Tsui, K. L. (2020). An adapted geographically weighted LASSO (Ada-GWL) model for predicting subway ridership. Transportation, 48, 1–32.
Knudsen, D. C., & Fotheringham, A. S. (1986). Matrix comparison, goodness of fit, and spatial interaction modeling. International Regional Science Review, 10(2), 127–147.
Kompil, M., & Celik, H. M. (2013). Modelling trip distribution with fuzzy and genetic fuzzy systems. Transportation Planning and Technology, 36(2), 170–200.
Lenormand, M., Bassolas, A., & Ramasco, J. J. (2016). Systematic comparison of trip distribution laws and models. Journal of Transport Geography, 51, 158–169.
Li, Y., Wang, H., Zhao, J., & Du, B. (2018). Multisource data-driven modeling method for estimation of intercity trip distribution. Mathematical Problems in Engineering.
Marquardt, D. W. (1980). Comment: You should standardize the predictor variables in your regression models. Journal of the American Statistical Association, 75(369), 87–91.
Martínez, L. M., & Viegas, J. M. (2013). A new approach to modelling distance decay functions for accessibility assessment in transport studies. Journal of Transport Geography, 26, 87–96.
Ministerio de Transportes, M. Y. A. U. (2021). Publicaciones de la encuesta permanente de transporte de mercancías por carretera. https://www.mitma.gob.es/informacion-para-el-ciudadano/informacion-estadistica/transporte/transporte-de-mercancias-por-carretera/publicaciones-encuesta-permanente-transporte-mercancias-por-carretera/2014/encuesta-permanente-transporte-mercancias-carretera-anos2006. Accessed 24 March 2022.
Narula, S. C. (1979). Orthogonal polynomial regression. International Statistical Review/Revue Internationale de Statistique, 31–36.
Natarajan, B. K. (1995). Sparse approximate solutions to linear systems. SIAM Journal on Computing, 24(2), 227–234.
Nocedal, J., & Wright, S. (2006). Numerical optimization. Springer.
Openshaw, S. (1998). Neural network, genetic, and fuzzy logic models of spatial interaction. Environment and Planning A, 30(10), 1857–1872.
Openshaw, S., & Connolly, C. (1977). Empirically derived deterrence functions for maximum performance spatial interaction models. Environment and Planning A, 9(9), 1067–1079.
Ortúzar, J. D. D., & Willumsen, L. G. (2011). Modelling transport. Wiley.
Öztürk, F., & Akdeniz, F. (2000). Ill-conditioning and multicollinearity. Linear Algebra and Its Applications, 321(1–3), 295–305.
Rubio-Herrero, J., & Muñuzuri, J. (2021). Indirect estimation of interregional freight flows with a real-valued genetic algorithm. Transportation, 48(1), 257–282.
Sbai, A., & Ghadi, F. (2017). Impact of aggregation and deterrence function choice on the parameters of gravity model. In Proceedings of the mediterranean symposium on smart city applications (pp. 54–66).
Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379–423.
Shao, H., Lam, W. H., Sumalee, A., & Hazelton, M. L. (2015). Estimation of mean and covariance of stochastic multi-class OD demands from classified traffic counts. Transportation Research Part C: Emerging Technologies, 59, 92–110.
Sun, S., Huang, R., & Gao, Y. (2012). Network-scale traffic modeling and forecasting with graphical lasso and neural networks. Journal of Transportation Engineering, 138(11), 1358–1367.
Suprayitno, H. (2018). Searching the correct and appropriate deterrence function general formula for calculating gravity trip distribution model. IPTEK The Journal of Engineering, 4(3), 17–25.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.
Tillema, F., Van Zuilekom, K. M., & Van Maarseveen, M. F. (2006). Comparison of neural networks and gravity models in trip distribution. Computer-Aided Civil and Infrastructure Engineering, 21(2), 104–119.
Wang, S., Ji, B., Zhao, J., Liu, W., & Xu, T. (2018). Predicting ship fuel consumption based on lasso regression. Transportation Research Part D: Transport and Environment, 65, 817–824.
Wilson, A. (1967). A statistical theory of spatial distribution models. Transportation Research, 1(3), 253–269.
Funding
This research has been funded by the Consejería de Economía, Conocimiento, Empresas y Universidad of Andalusia (project TRACSINT, P20_01183) within Programme FEDER 2014-2020.
Author information
Contributions
JRH: Literature search and review, manuscript writing, statistical tests, and optimization methods. JM: Literature search and review, writing, and analysis of performance of deterrence functions.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests relevant to the content of this article.
Appendix A Performance results of best subset regression
About this article
Cite this article
Rubio-Herrero, J., Muñuzuri, J. Sparse regression for data-driven deterrence functions in gravity models. Ann Oper Res 323, 153–174 (2023). https://doi.org/10.1007/s10479-023-05227-3
DOI: https://doi.org/10.1007/s10479-023-05227-3