Abstract
The Singular Value Decomposition (SVD) is a matrix decomposition technique that has been successfully applied to the recovery of blocks of missing values in time series. To recover a block accurately, SVD requires highly correlated time series. However, lowly correlated time series that exhibit shape and/or trend similarities could also increase the recovery accuracy, so they could be exploited by including them in the recovery process.
In this paper, we compare the accuracy of the Centroid Decomposition (CD) against that of SVD for the recovery of blocks of missing values using highly and lowly correlated time series. We show that the CD technique better exploits the trend and shape similarity to lowly correlated time series and yields a better recovery accuracy. We run experiments on real-world hydrological and synthetic time series to validate our results.
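To make the idea of SVD-based block recovery concrete, the following is a minimal sketch of one common formulation: initialise the missing entries, compute a truncated SVD, overwrite only the missing entries with the low-rank reconstruction, and repeat until the values stop changing. It is an illustration under stated assumptions, not the authors' algorithm. The function name `recover_block_svd`, the column-mean initialisation, the `rank` parameter, and the stopping rule are all hypothetical choices made for this example.

```python
import numpy as np

def recover_block_svd(X, missing_mask, rank=1, tol=1e-6, max_iter=200):
    """Fill the entries flagged in missing_mask by iterating a truncated SVD.

    X            : 2-D array, rows are time points, columns are time series.
    missing_mask : boolean array of the same shape, True where a value is missing.
    rank         : number of singular values kept (an assumption of this sketch).
    """
    X = np.array(X, dtype=float)
    mask = np.asarray(missing_mask, dtype=bool)
    filled = X.copy()

    # Initialise each missing entry with the mean of the observed values
    # in its column (assumes every column has at least one observed value).
    for j in range(X.shape[1]):
        col_missing = mask[:, j]
        if col_missing.any():
            filled[col_missing, j] = X[~col_missing, j].mean()

    for _ in range(max_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Low-rank reconstruction from the leading singular triplets.
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Size of the update applied to the missing entries.
        delta = np.linalg.norm(approx[mask] - filled[mask])
        # Observed values are never modified; only the missing block is updated.
        filled[mask] = approx[mask]
        if delta < tol:
            break
    return filled

# Usage: two perfectly correlated synthetic series, with a block missing in the first.
t = np.linspace(0, 2 * np.pi, 50)
truth = np.column_stack([np.sin(t), 2 * np.sin(t)])
mask = np.zeros_like(truth, dtype=bool)
mask[10:20, 0] = True
observed = truth.copy()
observed[mask] = np.nan
recovered = recover_block_svd(observed, mask, rank=1)
```

In this synthetic case the second series is an exact multiple of the first, which is the highly correlated setting in which SVD-based recovery is expected to work well. The sketch does not cover the Centroid Decomposition.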
Notes
1. The data was kindly provided by HydroloGIS (http://www.hydrologis.edu).
2. The data was kindly provided by Südtiroler Beratungsring (http://www.beratungsring.org).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Khayati, M., Böhlen, M.H., Mauroux, P.C. (2015). Using Lowly Correlated Time Series to Recover Missing Values in Time Series: A Comparison Between SVD and CD. In: Claramunt, C., et al. Advances in Spatial and Temporal Databases. SSTD 2015. Lecture Notes in Computer Science, vol 9239. Springer, Cham. https://doi.org/10.1007/978-3-319-22363-6_13
DOI: https://doi.org/10.1007/978-3-319-22363-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22362-9
Online ISBN: 978-3-319-22363-6
eBook Packages: Computer Science, Computer Science (R0)