Abstract
In time series mining, subsequence time series (STS) clustering is widely used as a subroutine in tasks such as anomaly detection, classification, and rule discovery. Its main objective is to group similar underlying subsequences together. Beyond the known problem of meaningless STS clustering results, a further challenge is clustering subsequence patterns of variable length. Existing approaches handle only a small range of width variability and depend on predefined parameters, which makes them impractical for real-world data. We therefore propose a new algorithm that handles much larger variability in pattern widths and is parameter-free, sparing users the difficult task of parameter selection. The algorithm adopts the Minimum Description Length (MDL) principle and a motif discovery technique to determine the proper widths of the subsequences. Experimental results confirm that the proposed algorithm effectively handles very large width variability in time series subsequence patterns and outperforms all other recent STS clustering algorithms.
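To make the role of MDL in width selection concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes the series has already been discretized to integers, measures description length as length times entropy (one common formulation in MDL-based time series work), and scores a candidate width by how many bits are saved when one occurrence of a pattern is encoded as its difference from another. The function names `description_length`, `bits_saved`, and `pick_width`, and the assumption that the two occurrence start positions are already known (for example, from a motif discovery step), are illustrative only.

```python
import math
from collections import Counter


def description_length(values):
    """Entropy-based bit cost of an integer sequence: len(values) * H(values)."""
    n = len(values)
    counts = Counter(values)
    entropy = sum(c / n * math.log2(n / c) for c in counts.values())
    return n * entropy


def bits_saved(a, b):
    """Bits saved by encoding b as its pointwise difference from a.

    Assumes a and b have equal length and hold discretized (integer) values.
    """
    diff = [y - x for x, y in zip(a, b)]
    return description_length(b) - description_length(diff)


def pick_width(data, start_a, start_b, widths):
    """Return the candidate width that saves the most bits.

    start_a and start_b are hypothetical start positions of two
    occurrences of the same pattern in data.
    """
    def saving(w):
        return bits_saved(data[start_a:start_a + w], data[start_b:start_b + w])
    return max(widths, key=saving)


# Toy series: a 6-point pattern occurs at positions 0 and 10,
# and the points that follow each occurrence differ.
pattern = [0, 3, 1, 4, 2, 5]
series = pattern + [7, 7, 7, 7] + pattern + [0, 6, 0, 6]
chosen = pick_width(series, 0, 10, widths=[4, 6, 8])
print(chosen)
```

In this toy series a width that is too short saves few bits because little of the pattern is covered, while a width that is too long pays for encoding the mismatched tail, so the saving peaks near the true pattern width. A parameter-free method would have to find the occurrences and candidate widths itself rather than take them as arguments.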
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Madicar, N., Sivaraks, H., Rodpongpun, S., Ratanamahatana, C.A. (2014). An Enhanced Parameter-Free Subsequence Time Series Clustering for High-Variability-Width Data. In: Herawan, T., Ghazali, R., Deris, M. (eds) Recent Advances on Soft Computing and Data Mining. Advances in Intelligent Systems and Computing, vol 287. Springer, Cham. https://doi.org/10.1007/978-3-319-07692-8_40
DOI: https://doi.org/10.1007/978-3-319-07692-8_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07691-1
Online ISBN: 978-3-319-07692-8
eBook Packages: Engineering, Engineering (R0)