Abstract
Time series anomaly detection is an important problem in data science. Anomalies can be detected with statistical, distance-based, clustering-based, or density-based approaches. Distance-based methods are relatively straightforward, but their effectiveness depends on how well they handle the distribution of the data points. To address this challenge, a preprocessing step can convert the underlying time series into a more useful representation. This paper proposes a novel clustering-based representation of time series, which is then used to compute anomaly scores and detect anomalies. Experimental studies on synthetic and real datasets show that the proposed method outperforms other methods by up to 75% on five standard performance metrics.
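The two-stage idea in the abstract (build a clustering-based representation, then score it) can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes that the representation is a split of the series into k contiguous segments chosen by dynamic programming to minimize within-segment squared error, and that a segment's anomaly score is the mean distance between its mean and the means of the other segments. The function names `segment_series`, `segment_scores`, and `point_scores` are hypothetical.

```python
import numpy as np


def segment_series(x, k):
    """Split x into k contiguous segments with minimal total squared error.

    Returns k + 1 boundaries, starting at 0 and ending at len(x).
    Illustrative stand-in only, not the method proposed in the paper.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Prefix sums give the squared error of any segment in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def cost(i, j):
        # Squared error of x[i:j] around its own mean.
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    dp = np.full((k + 1, n + 1), np.inf)
    cut = np.zeros((k + 1, n + 1), dtype=int)
    dp[0, 0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                v = dp[c - 1, i] + cost(i, j)
                if v < dp[c, j]:
                    dp[c, j] = v
                    cut[c, j] = i
    # Backtrack from the end of the series to recover the boundaries.
    bounds = [n]
    j = n
    for c in range(k, 0, -1):
        j = int(cut[c, j])
        bounds.append(j)
    return bounds[::-1]


def segment_scores(x, bounds):
    """Score each segment by the mean distance of its mean to the others (k >= 2)."""
    x = np.asarray(x, dtype=float)
    means = np.array([x[a:b].mean() for a, b in zip(bounds[:-1], bounds[1:])])
    dist = np.abs(means[:, None] - means[None, :])
    return dist.sum(axis=1) / (len(means) - 1)


def point_scores(x, k):
    """Give every point the score of the segment that contains it."""
    bounds = segment_series(x, k)
    return np.repeat(segment_scores(x, bounds), np.diff(bounds))
```

Calling `point_scores(series, k)` returns one score per observation, and the highest scores mark the candidate anomalies. The cubic-time dynamic program is chosen for clarity and is only practical for short series.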




















Data availability
The Yahoo S5 datasets analyzed during the current study are available at https://webscope.sandbox.yahoo.com/catalog.php?datatype=s&did=70, and the synthetic datasets were generated with the https://github.com/KDD-OpenSource/agots repository. The Sin dataset is generated by Eq. (8).
Notes
Piecewise Aggregate Approximation.
Symbolic Aggregate approXimation.
Discrete Fourier Transform.
Discrete Wavelet Transform.
Singular Value Decomposition.
Principal Component Analysis.
Gaussian Mixture Models.
Stochastic Outlier Selection.
Clustering-Based Local Outlier Factor.
Isolation Forest.
It should be noted that the clustering-based representation mechanism is completely different from the clustering-based anomaly detection approaches discussed in Sect. 3.
Optimal Sequence Clustering algorithm.
Appendix: Tables of Scenario 2
The numerical results in Tables 5, 6, 7, and 8 are reported with 95% confidence intervals.
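The text does not state how these intervals were computed. As a rough sketch, assuming each table entry summarizes repeated runs of one metric and that a normal approximation is acceptable, a 95% interval could be obtained as follows. The function name `confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, level=0.95):
    """Return (mean, half_width) of a normal-approximation confidence interval.

    Assumption: samples are independent repetitions of the same metric.
    """
    # Two-sided critical value, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * stdev(samples) / sqrt(len(samples))
    return mean(samples), half_width
```

The result reads as mean plus or minus half_width. For a small number of runs, a Student t critical value would give a wider and more appropriate interval than the normal value used here.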
About this article
Cite this article
Enayati, E., Mortazavi, R., Basiri, A. et al. Time series anomaly detection via clustering-based representation. Evolving Systems 15, 1115–1136 (2024). https://doi.org/10.1007/s12530-023-09543-8
DOI: https://doi.org/10.1007/s12530-023-09543-8