An Enhanced Parameter-Free Subsequence Time Series Clustering for High-Variability-Width Data

  • Conference paper
Recent Advances on Soft Computing and Data Mining

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 287)


Abstract

In time series mining, subsequence time series (STS) clustering is widely used as a subroutine in various mining tasks, e.g., anomaly detection, classification, and rule discovery. The main objective of STS clustering is to group similar underlying subsequences together. Beyond the known problem of meaninglessness in STS clustering results, another challenge is clustering when the subsequence patterns have variable lengths. Existing approaches handle only cases where the range of width variability is small, and they depend on predefined parameters, which is impractical for real-world data. We therefore propose a new algorithm that can handle much larger variability in pattern widths while remaining parameter-free, so that users no longer face the difficult task of parameter selection. The Minimum Description Length (MDL) principle and a motif discovery technique are adopted to determine the proper widths of the subsequences. The experimental results confirm that the proposed algorithm effectively handles very large width variability in time series subsequence patterns, outperforming all other recent STS clustering algorithms.
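The abstract does not spell out how MDL is applied, so the following is only a minimal sketch of one common entropy-based formulation of description length for discretized time series. It is an assumption for illustration, not the authors' algorithm. The function names (`discretize`, `description_length`, `conditional_description_length`) and the `cardinality` parameter are hypothetical.

```python
import math
from collections import Counter

def discretize(series, cardinality=64):
    """Map real values to integers in [0, cardinality - 1] via min-max scaling.

    Assumption: a fixed cardinality is used so that entropy is well defined.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    scale = (cardinality - 1) / (hi - lo)
    return [int(round((x - lo) * scale)) for x in series]

def description_length(symbols):
    """Entropy-based description length in bits: len(symbols) * H(symbols)."""
    n = len(symbols)
    counts = Counter(symbols)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return n * entropy

def conditional_description_length(symbols, hypothesis):
    """Bits to encode `symbols` given `hypothesis`, taken as DL of their difference.

    Assumption: both sequences have the same length.
    """
    if len(symbols) != len(hypothesis):
        raise ValueError("sequences must have equal length")
    diff = [a - b for a, b in zip(symbols, hypothesis)]
    return description_length(diff)

# Two nearly identical discretized subsequences (made-up values).
a = [0, 1, 2, 3, 4, 5, 6, 7]
b = [0, 1, 2, 3, 4, 5, 6, 8]
saving = description_length(a) - conditional_description_length(a, b)
```

Under this formulation, a positive `saving` means that encoding `a` relative to `b` costs fewer bits than encoding `a` alone, which is the kind of signal an MDL-based method could use when comparing candidate subsequence widths or deciding whether two subsequences belong in the same cluster.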



Author information

Corresponding author

Correspondence to Navin Madicar.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Madicar, N., Sivaraks, H., Rodpongpun, S., Ratanamahatana, C.A. (2014). An Enhanced Parameter-Free Subsequence Time Series Clustering for High-Variability-Width Data. In: Herawan, T., Ghazali, R., Deris, M. (eds) Recent Advances on Soft Computing and Data Mining. Advances in Intelligent Systems and Computing, vol 287. Springer, Cham. https://doi.org/10.1007/978-3-319-07692-8_40

  • DOI: https://doi.org/10.1007/978-3-319-07692-8_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07691-1

  • Online ISBN: 978-3-319-07692-8

  • eBook Packages: Engineering, Engineering (R0)
