Hierarchical clustering of unequal-length time series with area-based shape distance

  • Methodologies and Application
Soft Computing

Abstract

Time-series clustering algorithms have been used in a variety of areas to extract valuable information from complex and massive data sets. However, these algorithms suffer from two shortcomings. On the one hand, most of them are designed for equal-length time series, while unequal-length time series are often encountered in real-world problems. On the other hand, commonly used distance measures for time series cannot fully reveal trend differences. To overcome these two shortcomings, this paper focuses on the trend of time series and employs the area-based shape distance to measure their similarity. In addition, we present a new hierarchical clustering algorithm for unequal-length time series based on the area-based shape distance measure. A series of experiments illustrates the performance of the proposed clustering algorithm.
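
To make the general idea concrete, the following is a minimal sketch of hierarchical clustering over unequal-length series driven by a pairwise, area-style distance. It is not the algorithm proposed in the paper: the abstract does not define the area-based shape distance, so `area_between` below is a hypothetical stand-in that rescales both series onto a common time axis, removes each series' mean, and integrates the absolute gap between the two curves. The function names `resample`, `area_between`, and `agglomerate`, the choice of average linkage, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def resample(series, n_points):
    """Linearly interpolate a series onto n_points evenly spaced positions."""
    if len(series) == 1:
        return [float(series[0])] * n_points
    last = len(series) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)  # fractional index into the series
        lo = min(int(pos), last - 1)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[lo + 1] * frac)
    return out


def area_between(a, b, n_points=101):
    """Hypothetical stand-in distance, NOT the paper's area-based shape distance.

    Both series are stretched onto the time axis [0, 1] and mean-centred,
    then the area of the absolute gap is approximated by the trapezoidal rule.
    """
    ya, yb = resample(a, n_points), resample(b, n_points)
    ma, mb = sum(ya) / n_points, sum(yb) / n_points
    diff = [abs((p - ma) - (q - mb)) for p, q in zip(ya, yb)]
    step = 1.0 / (n_points - 1)
    return sum((diff[k] + diff[k + 1]) / 2 * step for k in range(n_points - 1))


def agglomerate(series, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of series indices."""
    dist = {frozenset(p): area_between(series[p[0]], series[p[1]])
            for p in combinations(range(len(series)), 2)}
    clusters = [[i] for i in range(len(series))]

    def average_link(c1, c2):
        total = sum(dist[frozenset((i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: average_link(clusters[p[0]], clusters[p[1]]))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters


# Toy data (assumed): two rising and two falling series of different lengths.
series = [
    [i / 19 for i in range(20)],
    [i / 34 for i in range(35)],
    [1 - i / 24 for i in range(25)],
    [1 - i / 49 for i in range(50)],
]
clusters = agglomerate(series, 2)
print(sorted(sorted(c) for c in clusters))
```

Because the distance is computed pairwise after rescaling, no series needs to be truncated or padded to a common length. In this toy example the two rising series should be grouped together and the two falling series together, regardless of their lengths.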

Acknowledgements

This work was funded by the National Natural Science Foundation of China (Nos. 11701338, 11571001), the Natural Science Foundation of Shandong Province (No. ZR2016AP12), and a Project of the Shandong Province Higher Educational Science and Technology Program (No. J17KB124).

Author information

Corresponding author

Correspondence to Fusheng Yu.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, X., Yu, F., Pedrycz, W. et al. Hierarchical clustering of unequal-length time series with area-based shape distance. Soft Comput 23, 6331–6343 (2019). https://doi.org/10.1007/s00500-018-3287-6

  • DOI: https://doi.org/10.1007/s00500-018-3287-6
