Multistate time series imputation using generative adversarial network with applications to traffic data

Li, Haitao; Cao, Qian; Bai, Qiaowen; Li, Zhihui; Hu, Hongyu

doi:10.1007/s00521-022-07961-4

Original Article
Published: 23 November 2022

Volume 35, pages 6545–6567, (2023)

Neural Computing and Applications

Haitao Li¹,
Qian Cao¹,
Qiaowen Bai ORCID: orcid.org/0000-0002-2761-8519¹,
Zhihui Li¹ &
…
Hongyu Hu²


Abstract

Missing data in time series is a pervasive problem in many fields, especially in intelligent transportation systems, where it hinders the application of timing analysis methods and the fine adjustment of control strategies. Prevalent imputation approaches reconstruct missing data with high accuracy by exploiting a precise distribution model. However, the multistate characteristic of time series data and the uncertainty of the imputation process make the temporal data distribution harder to model and reduce imputation performance. In this paper, a novel time series generative adversarial imputation network (TGAIN) model is proposed to deal with the problem of missing time series data. The model combines the strength of GANs in modeling data distributions with the ability of multiple imputation to handle uncertainty. Specifically, the TGAIN network is designed and adversarially trained to learn the multistate distribution of missing time series data. Through the conditional vector constraint and the adversarial imputation process, the latent distribution for each missing position under different states can be effectively estimated from its implicit relationships with the partially observed information. A corresponding multiple imputation strategy is then proposed to handle the uncertainty of the imputation process and to determine the best fill value from the learned distribution. Furthermore, extensive experiments were conducted on two real traffic flow datasets. The comparative results show that the proposed TGAIN not only models time series data distributions and handles imputation uncertainty better, but also performs more robustly and stably as the missing rate increases.
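
The abstract describes two ingredients: a generator that proposes values for the missing positions, and a multiple imputation step that draws several candidates and selects a fill value. The sketch below illustrates only that general pattern and is not the authors' TGAIN implementation. The names `combine`, `multiple_impute`, and `toy_generator` are hypothetical, the generator is an untrained stand-in, and taking the median over draws is an assumed selection rule, since the abstract does not state how the best fill value is chosen.

```python
import numpy as np


def combine(x_obs, mask, x_gen):
    """Keep observed entries (mask == 1) and fill missing ones (mask == 0) from x_gen."""
    return mask * x_obs + (1 - mask) * x_gen


def multiple_impute(x_obs, mask, generator, n_draws=10, seed=0):
    """Draw several candidate fills and reduce them to one value per entry.

    Returns the per-entry median over the draws (assumed selection rule)
    and the per-entry standard deviation as a rough uncertainty measure.
    """
    rng = np.random.default_rng(seed)
    draws = np.stack([
        combine(x_obs, mask, generator(x_obs, mask, rng.standard_normal(x_obs.shape)))
        for _ in range(n_draws)
    ])
    return np.median(draws, axis=0), draws.std(axis=0)


def toy_generator(x_obs, mask, noise):
    """Hypothetical stand-in for a trained generator: observed mean plus small noise."""
    mean_obs = (x_obs * mask).sum() / mask.sum()
    return mean_obs + 0.1 * noise


# Two short series; zeros mark missing readings, flagged by mask == 0.
x = np.array([[10.0, 12.0, 0.0, 11.0],
              [9.0, 0.0, 13.0, 12.0]])
m = np.array([[1.0, 1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0, 1.0]])

imputed, spread = multiple_impute(x, m, toy_generator)
print(imputed)
print(spread)
```

Observed entries pass through unchanged and have zero spread across draws, while each missing entry receives a median of the candidate fills together with a spread that indicates how much the candidates disagree.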

Data availability

All datasets and code supporting the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgements

This research is supported by the National Natural Science Foundation of China (Key Program) (52131202) and the Natural Science Foundation of Jilin Province (20190201107JC). The authors would like to thank the Digital Roadway Interactive Visualization and Evaluation Network (DRIVENet) for providing the traffic volume data used to validate this methodology.

Author information

Authors and Affiliations

College of Transportation, Jilin University, Changchun, 130022, People’s Republic of China
Haitao Li, Qian Cao, Qiaowen Bai & Zhihui Li
College of Automotive Engineering, Jilin University, Changchun, 130022, People’s Republic of China
Hongyu Hu

Corresponding author

Correspondence to Qiaowen Bai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, H., Cao, Q., Bai, Q. et al. Multistate time series imputation using generative adversarial network with applications to traffic data. Neural Comput & Applic 35, 6545–6567 (2023). https://doi.org/10.1007/s00521-022-07961-4

Received: 23 November 2021
Accepted: 17 October 2022
Published: 23 November 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s00521-022-07961-4
