Spatial–temporal regularized tensor decomposition method for traffic speed data imputation

  • Original Paper
International Journal of Data Science and Analytics

Abstract

Missing values are common in the spatial–temporal traffic data collected by various detectors, and imputing them accurately is important for intelligent transportation systems. Because tensor decomposition methods are well suited to multidimensional data imputation, we treat missing traffic speed data imputation as a tensor decomposition problem and propose a three-stage framework based on spatial–temporal regularized tensor decomposition, which imputes the missing speeds by exploiting the hidden spatial–temporal characteristics and underlying structure of the data. First, we propose a high-precision initialization method based on a low-rank tensor completion model; experimental results show that this initialization of the tensor decomposition yields good imputation performance. Second, we design a threshold for flexibly choosing the truncation rank in the truncated higher-order singular value decomposition, so that the core tensor has an appropriate size and better captures the characteristics of each dimension. Finally, we use these features and add regularization terms related to the time intervals of a day and the locations of road detectors, and estimate the missing traffic speed data by spatial–temporal regularized Tucker decomposition (STRTD). In addition to the element-like random missing (EM) and fiber-like random missing (FM) scenarios, our experiments include a region-like random missing (RM) scenario that imitates real-world data loss. Experiments on real-world traffic speed data sets show that the STRTD model outperforms state-of-the-art imputation models, even at high missing rates.
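The second stage described above, a truncated higher-order singular value decomposition whose truncation rank is chosen by a threshold, can be illustrated with a minimal NumPy sketch. The abstract does not state the threshold rule, so the sketch assumes a cumulative-energy criterion on the singular values of each mode unfolding; that criterion, the function names (unfold, mode_product, choose_rank, truncated_hosvd, reconstruct), the default threshold and the toy tensor shape are all assumptions made for illustration and are not taken from the paper. The sketch also works on a fully observed tensor and omits the initialization and regularization stages.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (mode-n product)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def choose_rank(singular_values, threshold):
    """Assumed rule: smallest rank whose cumulative squared singular
    values reach the fraction `threshold` of the total energy."""
    energy = np.cumsum(singular_values ** 2)
    ratio = energy / energy[-1]
    rank = int(np.searchsorted(ratio, threshold)) + 1
    return min(rank, len(singular_values))

def truncated_hosvd(tensor, threshold=0.99):
    """Return a core tensor and one factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        u, s, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :choose_rank(s, threshold)])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each factor basis
    return core, factors

def reconstruct(core, factors):
    """Rebuild the approximation from the core tensor and factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Toy example: a hypothetical (detector, day, time interval) tensor
# built to have low multilinear rank (2, 3, 2).
rng = np.random.default_rng(0)
true_core = rng.standard_normal((2, 3, 2))
true_factors = [rng.standard_normal((n, r)) for n, r in zip((6, 8, 5), (2, 3, 2))]
speeds = reconstruct(true_core, true_factors)

core, factors = truncated_hosvd(speeds, threshold=1 - 1e-10)
approx = reconstruct(core, factors)
print("core shape:", core.shape)
print("relative error:", np.linalg.norm(approx - speeds) / np.linalg.norm(speeds))
```

Raising the threshold keeps more singular vectors per mode and enlarges the core tensor; lowering it gives a smaller core at the cost of a coarser approximation.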

Notes

  1. https://github.com/zhiyongc/Seattle-Loop-Data.

Funding

This paper was supported by the National Natural Science Foundation of China (62076143, 62202270), in part by the Shandong Excellent Young Scientists Fund (Oversea) (2022HWYQ-044), and in part by the Fundamental research promotion plan of Qilu University of Technology (Shandong Academy of Sciences) (No. 2021JC02009).

Author information

Contributions

XD conceived of the presented idea. HX wrote the main manuscript text and performed the experiments. HX completed all the charts and tables. YG further improved the experiment and revised the manuscript. XD made a comprehensive revision of the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Xiangjun Dong.

Ethics declarations

Conflicts of interest

On behalf of all authors, the corresponding author declares that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xie, H., Gong, Y. & Dong, X. Spatial–temporal regularized tensor decomposition method for traffic speed data imputation. Int J Data Sci Anal 17, 203–223 (2024). https://doi.org/10.1007/s41060-023-00412-w

  • DOI: https://doi.org/10.1007/s41060-023-00412-w
