Segmentation-Based Methods for Top-k Discords Detection in Static and Streaming Time Series Under Euclidean Distance

  • Conference paper
Context-Aware Systems and Applications (ICCASA 2021)

Abstract

Detecting the top-k discords in a time series is more useful than detecting only the single most unusual subsequence, because the result is a more informative and complete set. The first challenge of this task is determining the length of the discords. In addition, detecting top-k discords in streaming time series poses a second challenge: responding quickly when new data points arrive at high speed. To address these challenges, we propose two novel methods, TopK-EP-ALeader and TopK-EP-ALeader-S, which combine segmentation and clustering to detect top-k discords in static and streaming time series, respectively. Moreover, a circular buffer is built to store the local segment of a streaming time series so that anomaly scores can be calculated efficiently. Together with this circular buffer, a delayed update policy is defined to achieve the instant responses needed to overcome the second challenge. Experiments on nine datasets from different application domains confirm the effectiveness and efficiency of our methods for top-k discord discovery in static and streaming time series.
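
To make the task concrete, the sketch below shows a brute-force baseline for top-k discord discovery under Euclidean distance. It is not the TopK-EP-ALeader method described in the abstract: it assumes a fixed, user-supplied subsequence length `m` (the abstract treats choosing this length as an open challenge) and uses the common definition of a discord as the subsequence whose nearest non-overlapping neighbor is farthest away. The function names `euclidean` and `top_k_discords` are hypothetical and chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_discords(series, m, k):
    """Brute-force top-k discords (illustrative baseline, O(n^2 * m)).

    A subsequence's anomaly score is the distance to its nearest
    neighbor among subsequences that do not overlap it. The k
    highest-scoring, mutually non-overlapping subsequences are
    returned as (start_index, score) pairs.
    """
    n = len(series) - m + 1
    subs = [series[i:i + m] for i in range(n)]

    scores = []
    for i in range(n):
        nearest = math.inf
        for j in range(n):
            if abs(i - j) >= m:  # skip trivial (overlapping) matches
                nearest = min(nearest, euclidean(subs[i], subs[j]))
        if nearest < math.inf:
            scores.append((nearest, i))

    # Highest score first; greedily keep non-overlapping discords.
    scores.sort(reverse=True)
    chosen = []
    for score, i in scores:
        if all(abs(i - start) >= m for start, _ in chosen):
            chosen.append((i, score))
            if len(chosen) == k:
                break
    return chosen


# Usage: a periodic signal with one corrupted point at index 17.
signal = [0, 1, 0, -1] * 8
signal[17] = 5
discords = top_k_discords(signal, m=4, k=2)
```

The quadratic cost of this baseline is what makes it impractical for long or streaming series, which is the setting the segmentation and clustering approach in the abstract targets.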

Author information

Corresponding author

Correspondence to Huynh Thi Thu Thuy.

Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Thuy, H.T.T., Anh, D.T., Chau, V.T.N. (2021). Segmentation-Based Methods for Top-k Discords Detection in Static and Streaming Time Series Under Euclidean Distance. In: Cong Vinh, P., Rakib, A. (eds) Context-Aware Systems and Applications. ICCASA 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 409. Springer, Cham. https://doi.org/10.1007/978-3-030-93179-7_12

  • DOI: https://doi.org/10.1007/978-3-030-93179-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93178-0

  • Online ISBN: 978-3-030-93179-7

  • eBook Packages: Computer Science, Computer Science (R0)
