Abstract
Although innovative sensing technologies are now widely deployed, missing data in collections of time series remain common and pose a major threat to accurate data analysis. Many existing missing data prediction approaches are either infeasible or inefficient when applied to multiple time series. To address this problem, we propose a novel approach to missing data prediction in coevolving time series, based on compressive sensing theory and sparse Bayesian learning theory. First, we model the problem by designing the corresponding sparse representation basis and measurement matrix. The missing data prediction problem is then formulated as a multiple sparse vector recovery problem. Many simultaneous sparse estimation approaches focus on the joint estimation of multiple sparse vectors that share a common support, given linear observations; this requirement is too strict for some real applications. In this paper, exploiting the internal patterns of coevolving time series, we design a tuning-parameter-free algorithm based on sparse Bayesian learning that can solve multiple sparse estimation tasks simultaneously without requiring auxiliary information. Simulation results demonstrate that our approach can recover the entire time series efficiently using only the data that are not missing, even when a large fraction of the collected data is missing.
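The formulation described above can be illustrated with a minimal sketch for a single time series. It is not the authors' algorithm: it assumes a DCT sparse representation basis, a measurement matrix that simply selects the observed samples, a fixed noise variance, and a basic EM-style sparse Bayesian learning update. The names `dct_basis`, `sbl_recover`, `noise_var`, and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal cosine (inverse DCT-II) basis; a signal is x = Psi @ s."""
    k = np.arange(n)[None, :]   # frequency index (columns)
    t = np.arange(n)[:, None]   # time index (rows)
    psi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    psi[:, 0] /= np.sqrt(2.0)   # rescale the constant atom to unit norm
    return psi

def sbl_recover(A, y, noise_var=1e-4, n_iter=300):
    """Basic EM sparse Bayesian learning for y = A s + noise.

    Assumes a zero-mean Gaussian prior s_i ~ N(0, gamma_i) and a fixed,
    known noise variance (an assumption made for this sketch).
    """
    m, n = A.shape
    gamma = np.ones(n)
    mu = np.zeros(n)
    for _ in range(n_iter):
        AG = A * gamma                              # A @ diag(gamma)
        sigma_y = noise_var * np.eye(m) + AG @ A.T  # covariance of y
        K = np.linalg.solve(sigma_y, AG).T          # diag(gamma) A^T sigma_y^{-1}
        mu = K @ y                                  # posterior mean of s
        post_var = np.maximum(gamma - np.sum(K * AG.T, axis=1), 0.0)
        gamma = mu ** 2 + post_var                  # EM hyperparameter update
    return mu

# Synthetic series that is sparse in the DCT basis (illustrative data only).
rng = np.random.default_rng(0)
n = 128
Psi = dct_basis(n)
s_true = np.zeros(n)
s_true[[3, 10, 25]] = [5.0, -4.0, 3.0]
x_true = Psi @ s_true

# Measurement matrix Phi keeps only the observed samples (half are missing).
observed = np.sort(rng.choice(n, size=n // 2, replace=False))
Phi = np.eye(n)[observed]
y = Phi @ x_true
A = Phi @ Psi

s_hat = sbl_recover(A, y)
x_hat = Psi @ s_hat   # full series, including the missing positions
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"relative reconstruction error: {rel_err:.2e}")
```

The sketch handles one series at a time; the approach summarized in the abstract instead estimates several sparse vectors jointly without requiring a common support, which this example does not attempt.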
Additional information
This work is supported by the National Natural Science Foundation of China (61871400 and 61571463) and the Jiangsu Province Natural Science Foundation (BK20171401).
About this article
Cite this article
Song, X., Guo, Y., Li, N. et al. A novel approach for missing data prediction in coevolving time series. Computing 101, 1565–1584 (2019). https://doi.org/10.1007/s00607-018-0668-8
DOI: https://doi.org/10.1007/s00607-018-0668-8