Abstract
Detecting the top-k discords in a time series is more useful than detecting only the most unusual subsequence, because the result is a more informative and complete set rather than a single subsequence. The first challenge of this task is determining the length of the discords. Detecting top-k discords in streaming time series poses a second challenge: responding quickly when new data points arrive at high speed. To handle these challenges, we propose two novel methods, TopK-EP-ALeader and TopK-EP-ALeader-S, which combine segmentation and clustering to detect top-k discords in static and streaming time series, respectively. Moreover, a circular buffer is built to store the local segment of a streaming time series and to calculate anomaly scores efficiently. Along with this circular buffer, a delayed update policy is defined to achieve instant responses, which addresses the second challenge. Experiments on nine datasets from different application domains confirm the effectiveness and efficiency of our methods for top-k discord discovery in static and streaming time series.
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Thuy, H.T.T., Anh, D.T., Chau, V.T.N. (2021). Segmentation-Based Methods for Top-k Discords Detection in Static and Streaming Time Series Under Euclidean Distance. In: Cong Vinh, P., Rakib, A. (eds) Context-Aware Systems and Applications. ICCASA 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 409. Springer, Cham. https://doi.org/10.1007/978-3-030-93179-7_12
DOI: https://doi.org/10.1007/978-3-030-93179-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93178-0
Online ISBN: 978-3-030-93179-7
eBook Packages: Computer Science, Computer Science (R0)