Target Class Supervised Sample Length and Training Sample Reduction of Univariate Time Series

  • Conference paper
  • First Online:
Advances and Trends in Artificial Intelligence. From Theory to Practice (IEA/AIE 2021)

Abstract

Anomaly/novelty detection in time series analysis is one of the most widely studied research areas and is, in essence, a one-class classification problem. For univariate time series data, not only the large number of training samples but also the long sample span (observation length) adds computational overhead, together with the intrinsic curse of dimensionality (the sample length is treated as the dimension). In this context, the present research proposes an approach that concurrently reduces the sample span (treated as dimension) and the number of training samples of univariate time series under the supervision of target-class samples. Since the data representation of a time series governs the performance of any machine learning approach, the present research employs dissimilarity-based representation (DBR) techniques for time series representation; to then reduce the sample length, a knowledge grid is computed via eigenspace analysis of the variance-covariance matrix of the target-class samples. This knowledge grid is used to transform samples from the original length to a reduced one. Afterwards, the training samples are selected using prototype methods. The experiments use 16 different DBR measures along with 11 prototype techniques. Finally, a one-class support vector machine (OCSVM) and the 1-nearest-neighbour (1-NN) classifier are used for classification to validate the performance of the proposed approach over 85 UCR/UEA univariate datasets.
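To make the pipeline outlined above concrete, the following is a minimal sketch under stated assumptions: the paper's specific DBR measures, knowledge-grid construction, and prototype selectors are not reproduced here, so Euclidean-distance DBR, k-means centroids as prototypes, and a plain eigenspace (PCA-style) truncation of the target-class covariance stand in for them; the data, the `dbr` helper, and the chosen reduced length `k` are illustrative only.

```python
# Illustrative sketch, NOT the paper's exact method: Euclidean DBR, k-means
# prototypes, and PCA-style eigenspace truncation are placeholder choices.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Toy univariate time series: rows are samples, columns are time steps.
X_target = rng.normal(0.0, 1.0, size=(200, 150))   # target (normal) class, train
X_test   = rng.normal(0.5, 1.2, size=(50, 150))    # unseen data, test

# 1) Dissimilarity-based representation (DBR): each series is re-expressed
#    as its distances to a small prototype set (here: k-means centroids).
n_prototypes = 20
protos = KMeans(n_clusters=n_prototypes, n_init=10,
                random_state=0).fit(X_target).cluster_centers_

def dbr(X, prototypes):
    # Euclidean distance from every series to every prototype.
    return np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)

D_train = dbr(X_target, protos)
D_test  = dbr(X_test, protos)

# 2) "Knowledge grid" via eigenspace analysis of the target-class
#    variance-covariance matrix: keep the leading eigenvectors and project
#    the DBR vectors onto them, shrinking the sample length (dimension).
mu = D_train.mean(axis=0)
cov = np.cov(D_train - mu, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
k = 5                                    # reduced length (assumed)
grid = eigvecs[:, order[:k]]             # projection used as the "grid"

Z_train = (D_train - mu) @ grid
Z_test  = (D_test - mu) @ grid

# 3) Train a one-class SVM on the reduced target-class samples and score
#    the test set; +1 = target/normal, -1 = anomaly/novelty.
ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(Z_train)
print(ocsvm.predict(Z_test))

# Alternative 1-NN score: distance to the nearest reduced training sample.
nn_dist = np.min(np.linalg.norm(Z_test[:, None, :] - Z_train[None, :, :],
                                axis=2), axis=1)
print(nn_dist)
```

In practice the prototype step would also prune the training set itself (the paper compares 11 prototype techniques), and the DBR step would be repeated for each of the 16 dissimilarity measures before validating with OCSVM and 1-NN on the UCR/UEA datasets.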

Author information

Correspondence to Sanjay Kumar Sonbhadra.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Sonbhadra, S.K., Agarwal, S., Nagabhushan, P. (2021). Target Class Supervised Sample Length and Training Sample Reduction of Univariate Time Series. In: Fujita, H., Selamat, A., Lin, J.C.-W., Ali, M. (eds) Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2021. Lecture Notes in Computer Science, vol 12799. Springer, Cham. https://doi.org/10.1007/978-3-030-79463-7_51

  • DOI: https://doi.org/10.1007/978-3-030-79463-7_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-79462-0

  • Online ISBN: 978-3-030-79463-7

  • eBook Packages: Computer Science (R0)
