Novelets: a new primitive that allows online detection of emerging behaviors in time series

  • Regular paper
  • Published in: Knowledge and Information Systems

Abstract

Much of the world’s data are time series. While offline exploration of time series can be useful, time series are almost unique in allowing the possibility of direct and immediate intervention. For example, if we are monitoring an industrial process and have an algorithm that predicts imminent failure, we could direct a controller to open a pressure release valve or initiate an evacuation plan. There is a plethora of tools to monitor time series for known behaviors (pattern matching), previously unknown highly conserved behaviors (motifs), evolving behaviors (chains) and unexpected behaviors (anomalies). In this work, we claim that there is another useful primitive worth monitoring for: emerging behaviors. We call such behaviors Novelets. We explain that Novelets are not anomalies, chains, or motifs, but can be informally thought of as initially apparent anomalies that are later discovered to be motifs. We show that Novelets have a natural interpretation in many disciplines, including science, medicine, and industry. As we further demonstrate, Novelet discovery has many downstream uses, including prognostics and abnormal behavior detection. We demonstrate the utility of our proposed primitive on a diverse set of challenging domains.
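
The informal description above (an apparent anomaly that is later found to repeat) can be illustrated with a small toy sketch. This is not the authors' algorithm, only one possible reading of the idea under simplifying assumptions: the stream is cut into fixed-length, non-overlapping windows, similarity is z-normalized Euclidean distance, and a single distance threshold decides what counts as a match. The names `find_emerging`, `n_known`, and `threshold` are hypothetical and chosen for illustration.

```python
import numpy as np

def znorm(x):
    """Z-normalize a window; a constant window maps to all zeros."""
    x = np.asarray(x, dtype=float)
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)

def dist(a, b):
    """Z-normalized Euclidean distance between two equal-length windows."""
    return float(np.linalg.norm(znorm(a) - znorm(b)))

def find_emerging(windows, n_known, threshold):
    """Toy illustration (assumption, not the paper's method).

    The first n_known windows are taken as known behavior. A later window
    that matches nothing known is stored as a candidate (it looks like an
    anomaly). If a later window matches a stored candidate, the pair is
    reported as an emerging behavior and the pattern becomes known.
    """
    known = list(windows[:n_known])
    candidates = {}   # window index -> window that looked anomalous
    promoted = []     # (index of first occurrence, index of recurrence)
    for i in range(n_known, len(windows)):
        w = windows[i]
        if any(dist(w, k) <= threshold for k in known):
            continue  # already-known behavior
        match = next(
            (j for j, c in candidates.items() if dist(w, c) <= threshold),
            None,
        )
        if match is None:
            candidates[i] = w                     # so far, just an anomaly
        else:
            promoted.append((match, i))           # the "anomaly" recurred
            known.append(candidates.pop(match))   # now treated as known
    return promoted, sorted(candidates)

# Synthetic stream: sine windows are normal, a sawtooth appears twice,
# and a spike appears once.
rng = np.random.default_rng(0)
m = 32
t = np.linspace(0, 2 * np.pi, m, endpoint=False)

def noisy(x):
    return x + rng.normal(0.0, 0.05, m)

sine = np.sin(t)
saw = np.linspace(-1.0, 1.0, m)
spike = np.zeros(m)
spike[10] = 1.0

stream = [noisy(sine) for _ in range(5)]
stream += [noisy(saw), noisy(spike), noisy(sine), noisy(saw), noisy(sine)]

promoted, leftover = find_emerging(stream, n_known=4, threshold=2.0)
print("promoted (first, recurrence):", promoted)
print("still only anomalies:", leftover)
```

In this toy stream the sawtooth is first stored as a candidate and is promoted only when it recurs, while the one-off spike remains a plain anomaly. A practical detector would need sliding windows, a principled threshold, and bounded memory, none of which this sketch attempts.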

Notes

  1. The Roman satirist Juvenal wrote in AD 82 of rara avis in terris nigroque simillima cygno (“a rare bird in the lands, and very like a black swan”), meaning that since a black swan did not exist, the proposed “rare bird” did not exist. Here, “rare bird” was not literally a bird; it is just something that did not exist, like an honest politician.

  2. This story is reminiscent of, but is distinct from, the famous story of the discovery of the first computer “bug” (a moth) by Dr. Grace Hopper in 1947.

Funding

We acknowledge funding from NSF award IIS 2103976.

Author information

Contributions

R.M. wrote the main manuscript text and prepared figures. All authors reviewed the manuscript.

Corresponding author

Correspondence to Ryan Mercer.

Ethics declarations

Conflict of interest

The authors declare they have no financial interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mercer, R., Keogh, E. Novelets: a new primitive that allows online detection of emerging behaviors in time series. Knowl Inf Syst 66, 59–87 (2024). https://doi.org/10.1007/s10115-023-01936-0

  • DOI: https://doi.org/10.1007/s10115-023-01936-0
