Abstract
Joining two time series on subsequence correlation provides useful information about the synchronization of the time series. However, finding the exact subsequences that are most correlated is computationally expensive. Although the current efficient exact method, JOCOR, requires O(n^2 lg n) time, where n is the length of the time series, it is still very time-consuming even for time series of medium length. In this paper, we propose an approximate method, LCS-JOCOR, to reduce the runtime of JOCOR. Our proposed method consists of three steps. First, the two original time series are transformed into two corresponding strings by PAA transformation and SAX discretization. Second, we apply an algorithm to efficiently find the longest common substrings (LCSs) of the two strings. Finally, the resulting LCSs are mapped back to the original time series to find the most correlated subsequences by the JOCOR method. In comparison to JOCOR, our proposed method runs much faster while maintaining high accuracy.
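The first two steps, and the index mapping used by the third, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (sax, longest_common_substrings, to_series_range) and all parameter choices are hypothetical, the series length is assumed to be a multiple of the number of PAA segments, and the longest common substrings are found with the textbook dynamic-programming method, which may differ from the algorithm used in the paper. The final JOCOR refinement on the mapped ranges is not shown.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet_size):
    """Z-normalize a series, reduce it with PAA, then discretize it with SAX."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Assumption: len(series) is an exact multiple of n_segments.
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]
    # Breakpoints split the standard normal into equiprobable regions.
    cuts = [NormalDist().inv_cdf(k / alphabet_size)
            for k in range(1, alphabet_size)]
    # Symbol index = number of breakpoints lying below the PAA value.
    return "".join(chr(ord("a") + sum(v > c for c in cuts)) for v in paa)


def longest_common_substrings(s, t):
    """Return (length, [(start_in_s, start_in_t), ...]) for all longest matches."""
    best, hits = 0, []
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                # Extend the common substring ending at s[i-2], t[j-2].
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best, hits = cur[j], []
                if cur[j] == best:
                    hits.append((i - best, j - best))
        prev = cur
    return best, hits


def to_series_range(start, length, seg_len):
    """Map a substring of SAX symbols back to a half-open range of raw indices."""
    return start * seg_len, (start + length) * seg_len
```

Each pair of start positions returned by longest_common_substrings, converted with to_series_range, gives a candidate region in each original series; in the method described above, these regions would then be searched with JOCOR for the most correlated subsequences.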
Acknowledgement
We would like to thank Mr. John, a member of the MATLAB forum, for suggesting valuable ideas on the algorithm for finding the LCS of two strings.
Copyright information
© 2017 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Vinh, V.D., Chau, N.P., Anh, D.T. (2017). An Efficient Method for Time Series Join on Subsequence Correlation Using Longest Common Substring Algorithm. In: Cong Vinh, P., Tuan Anh, L., Loan, N., Vongdoiwang Siricharoen, W. (eds) Context-Aware Systems and Applications. ICCASA 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 193. Springer, Cham. https://doi.org/10.1007/978-3-319-56357-2_13
DOI: https://doi.org/10.1007/978-3-319-56357-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56356-5
Online ISBN: 978-3-319-56357-2
eBook Packages: Computer Science, Computer Science (R0)