
Knowledge transfer via distillation from time and frequency domain for time series classification

Published in Applied Intelligence (2023)

Abstract

Although deep learning has achieved great success on time series classification, two issues remain unsolved. First, existing methods mainly extract features from a single domain, so useful information available only in another domain cannot be exploited. Second, multi-domain learning usually increases model size, which makes deployment on mobile devices difficult. In this study, a lightweight double-branch model, called the Time Frequency Knowledge Reception Network (TFKR-Net), is proposed to fuse information from the time and frequency domains simultaneously. Instead of directly merging knowledge from teacher models pretrained in different domains, TFKR-Net independently distills knowledge from the time-domain and frequency-domain teachers, which helps maintain knowledge diversity. Experimental results on the UCR (University of California, Riverside) archive demonstrate that TFKR-Net significantly reduces model size and improves computational efficiency with only a slight loss in classification accuracy.
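As a rough illustration of the two-teacher scheme described above, the following PyTorch sketch distills a frozen time-domain teacher into the time branch of a small double-branch student, and a frozen frequency-domain teacher into its frequency branch, keeping one distillation term per domain. The branch architecture, the FFT magnitude spectrum as the frequency-domain view, and the loss weighting `alpha` are illustrative assumptions, not the authors' exact TFKR-Net.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Branch(nn.Module):
    """A tiny 1-D conv classifier; stands in for one student branch
    (the real TFKR-Net branches are more elaborate)."""
    def __init__(self, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(16, n_classes),
        )

    def forward(self, x):
        return self.net(x)

class StudentTFKR(nn.Module):
    """Double-branch student: one branch sees the raw series,
    the other sees its magnitude spectrum."""
    def __init__(self, n_classes):
        super().__init__()
        self.time_branch = Branch(n_classes)
        self.freq_branch = Branch(n_classes)

    def forward(self, x):                       # x: (batch, 1, length)
        z_t = self.time_branch(x)
        x_f = torch.fft.rfft(x, dim=-1).abs()   # frequency-domain view
        z_f = self.freq_branch(x_f)
        return z_t, z_f

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Standard Hinton-style soft-label distillation term."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T

def training_step(student, teacher_t, teacher_f, x, y, alpha=0.5):
    """Each branch distills from its own domain teacher, so the two
    knowledge sources stay separate instead of being merged."""
    z_t, z_f = student(x)
    with torch.no_grad():                        # teachers are frozen
        t_t = teacher_t(x)
        t_f = teacher_f(torch.fft.rfft(x, dim=-1).abs())
    ce = F.cross_entropy(z_t + z_f, y)           # fused prediction
    kd = kd_loss(z_t, t_t) + kd_loss(z_f, t_f)   # per-domain distillation
    return (1 - alpha) * ce + alpha * kd
```

Keeping one distillation term per domain, rather than averaging the two teachers' outputs into a single soft target, is what preserves the diversity of the transferred knowledge that the abstract emphasizes.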



Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 61903373 and the National Social Science Fund of China under Grant No. 18CZX017.

Author information


Corresponding author

Correspondence to Yi Hou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Ouyang, K., Hou, Y., Zhang, Y. et al. Knowledge transfer via distillation from time and frequency domain for time series classification. Appl Intell 53, 1505–1516 (2023). https://doi.org/10.1007/s10489-022-03485-5

