On the Ranking of Variable Length Discords Through a Hybrid Outlier Detection Approach

  • Conference paper
Discovery Science (DS 2022)

Abstract

In this paper, we are interested in identifying insightful changes in climate observation series through outlier detection techniques. Discords are outliers that span a subsequence of a given length rather than a single point in the time series. The choice of that length can be critical, which has motivated work on computing variable-length discords. Doing so, however, increases the number of discords, which may overlap or subsume one another and make the results less insightful. In this work, we introduce a hybrid approach that ranks variable-length discords and extracts the most prominent ones, which can yield more impactful results. We propose a ranking function over the extracted variable-length discords that accounts for the point anomalies they contain. We investigate the combination of pattern-wise anomaly detection, through the Matrix Profile paradigm, with two different point-wise anomaly detectors: MAD and PROPHET, two algorithms based on different concepts for extracting point anomalies. We tested our approach on climate observations consisting of monthly runoff time series between 1902 and 2005 over the West African region. Experimental results indicate that PROPHET combined with the Matrix Profile method yields higher-quality rankings, by extracting higher values of extreme events within the variable-length discords.
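The hybrid idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the ranking function, so the score used here (number of point anomalies inside a discord, with ties broken by a length-normalized discord distance) is a hypothetical stand-in. The sketch also assumes that MAD refers to the median absolute deviation, uses the conventional 0.6745 scaling and 3.5 cut-off for the modified z-score, and replaces an optimized Matrix Profile algorithm with a brute-force computation that is only practical for short series. All function names are illustrative.

```python
import numpy as np


def mad_point_anomalies(x, threshold=3.5):
    """Return indices whose modified z-score exceeds `threshold`.

    Assumption: MAD = median absolute deviation; 0.6745 and 3.5 are the
    conventional constants, not values taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        return np.array([], dtype=int)
    modified_z = 0.6745 * (x - median) / mad
    return np.flatnonzero(np.abs(modified_z) > threshold)


def znorm(s):
    """Z-normalize a subsequence (all zeros if it is constant)."""
    std = s.std()
    return (s - s.mean()) / std if std > 0 else np.zeros_like(s)


def top_discord(x, m):
    """Brute-force matrix profile for window length m.

    Returns (start, distance) of the subsequence whose nearest
    non-trivial neighbour is farthest away, i.e. the top discord.
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - m + 1
    subs = np.array([znorm(x[i:i + m]) for i in range(n)])
    dists = np.linalg.norm(subs[:, None, :] - subs[None, :, :], axis=2)
    # Exclusion zone: ignore trivial matches with overlapping neighbours.
    idx = np.arange(n)
    dists[np.abs(idx[:, None] - idx[None, :]) <= max(1, m // 2)] = np.inf
    profile = dists.min(axis=1)
    profile[~np.isfinite(profile)] = -np.inf  # no valid neighbour
    start = int(np.argmax(profile))
    return start, float(profile[start])


def rank_discords(x, lengths, threshold=3.5):
    """Rank one top discord per length (hypothetical scoring).

    Score: count of contained point anomalies, ties broken by the
    discord distance divided by sqrt(length).
    """
    anomalies = mad_point_anomalies(x, threshold)
    ranked = []
    for m in lengths:
        start, dist = top_discord(x, m)
        inside = int(np.sum((anomalies >= start) & (anomalies < start + m)))
        ranked.append({
            "start": start,
            "length": m,
            "n_point_anomalies": inside,
            "norm_dist": dist / np.sqrt(m),
        })
    ranked.sort(key=lambda d: (d["n_point_anomalies"], d["norm_dist"]),
                reverse=True)
    return ranked
```

Calling `rank_discords(series, lengths=[12, 24, 36])` on a monthly series would return one candidate discord per length, ordered by the hypothetical score. A forecasting-based detector such as PROPHET could be substituted for `mad_point_anomalies` without changing the ranking step.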


Acknowledgements

The authors would like to thank the Occitanie Region, who partially funded this research, and the reviewers for their comments and suggestions.

Author information

Corresponding author

Correspondence to Hussein El Khansa.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Khansa, H.E., Gervet, C., Brouillet, A. (2022). On the Ranking of Variable Length Discords Through a Hybrid Outlier Detection Approach. In: Pascal, P., Ienco, D. (eds) Discovery Science. DS 2022. Lecture Notes in Computer Science, vol 13601. Springer, Cham. https://doi.org/10.1007/978-3-031-18840-4_24

  • DOI: https://doi.org/10.1007/978-3-031-18840-4_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18839-8

  • Online ISBN: 978-3-031-18840-4

  • eBook Packages: Computer Science, Computer Science (R0)
