CBR: An Effective Clustering Approach for Time Series Events

Neural Processing Letters

Abstract

As technology advances, large volumes of time series data have emerged in all walks of life, and clustering is a key technique for analysing them. However, most existing clustering methods compute distances between single discrete data points and cannot be applied to continuous time series data with structural distortion (e.g., expansion, contraction, and drift) and noise (e.g., pseudo-events), which results in low clustering accuracy. In this paper, we propose a novel time series event clustering approach called CBR (Clustering Based on Representative sequences). First, we introduce a cross-correlation method to measure the distance between sequences with structural distortion, and propose an r-nearest neighbor evaluation system for sequences to construct candidate sets of R-Seqs (Representative sequences) and eliminate pseudo-event interference. Second, we formulate composite selection approaches for R-Seqs, based on combinatorial optimization and diversified top-k queries, to rapidly derive the optimal R-Seqs from the candidate sets. Finally, relying on the dynamically constructed distance matrix between the R-Seqs and the dataset, we propose a matrix clustering method based on K-means to achieve an efficient division of event classes. Experimental results demonstrate that CBR is superior to existing approaches in clustering accuracy, efficiency, and denoising quality; in particular, clustering accuracy is improved by more than 30% on average.
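The general idea of clustering on a distance matrix to representative sequences can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact distance formula, so `ncc_distance` assumes a common shift-tolerant choice (one minus the maximum normalized cross-correlation over all lags). The representatives are simply hand-picked, so the r-nearest neighbor evaluation and the diversified top-k selection are not reproduced. The helper names `pulse`, `wave`, `distance_matrix`, and `kmeans` and the toy data are hypothetical.

```python
import numpy as np

def ncc_distance(x, y):
    """Assumed distance: 1 - max normalized cross-correlation over all lags."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    if nx == 0.0 or ny == 0.0:
        # A flat sequence has no shape: identical only to another flat one.
        return 0.0 if nx == ny else 1.0
    cc = np.correlate(x, y, mode="full")  # inner products at every lag
    return 1.0 - cc.max() / (nx * ny)

def distance_matrix(sequences, representatives):
    """Row i holds the distances from sequence i to each representative."""
    return np.array([[ncc_distance(s, r) for r in representatives]
                     for s in sequences])

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def pulse(start, n=60):
    """Toy event: a rectangular bump of width 6 starting at `start`."""
    s = np.zeros(n)
    s[start:start + 6] = 1.0
    return s

def wave(start, n=60):
    """Toy event: three sine cycles (period 8) starting at `start`."""
    s = np.zeros(n)
    s[start:start + 24] = np.sin(2 * np.pi * np.arange(24) / 8)
    return s

# Three drifted copies of each toy event shape.
sequences = [pulse(5), pulse(20), pulse(40), wave(5), wave(20), wave(30)]
# Hand-picked representatives (assumption), one per event shape.
representatives = [sequences[0], sequences[3]]

features = distance_matrix(sequences, representatives)
labels = kmeans(features, k=2)
print(labels)
```

Because the distance takes the best alignment over all lags, drifted copies of the same event stay close to their representative, whereas a pointwise Euclidean distance would treat two non-overlapping pulses as far apart. Each sequence is then described only by its distances to the representatives, so the k-means step runs in a space whose dimension equals the number of representatives rather than the sequence length.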

Author information

Corresponding author

Correspondence to Baoyan Song.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: the National Natural Science Foundation of China (No. 61502215, 51704138), the China Postdoctoral Science Foundation (No. 2020M672134), and the Scientific Research Project of the Educational Department of Liaoning Province (No. LJC201913, No. LJKZ0094).

About this article

Cite this article

Wang, J., Ma, R., Xia, L. et al. CBR: An Effective Clustering Approach for Time Series Events. Neural Process Lett 54, 3401–3423 (2022). https://doi.org/10.1007/s11063-022-10763-3

  • DOI: https://doi.org/10.1007/s11063-022-10763-3
