Abstract
Although deep learning has achieved great success in time series classification, two issues remain unsolved. First, existing methods mainly extract features from a single domain, so useful information specific to other domains cannot be exploited. Second, multi-domain learning usually increases model size, which makes deployment on mobile devices difficult. In this study, a lightweight double-branch model, the Time-Frequency Knowledge Reception Network (TFKR-Net), is proposed to simultaneously fuse information from the time and frequency domains. Instead of directly merging knowledge from teacher models pretrained in different domains, TFKR-Net distills knowledge from the time-domain and frequency-domain teachers independently, which helps maintain knowledge diversity. Experimental results on the UCR (University of California, Riverside) archive demonstrate that TFKR-Net significantly reduces model size and improves computational efficiency with only a small loss in classification accuracy.
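The abstract describes a double-branch student that distills independently from a time-domain teacher and a frequency-domain teacher. The sketch below illustrates one way such a setup could look in PyTorch; it is a minimal illustration under stated assumptions, not the authors' implementation. The branch architecture and sizes, the use of an FFT magnitude spectrum as the frequency input, the temperature T, and the loss weight alpha are all assumptions introduced here for illustration.

```python
# Minimal sketch of dual-domain knowledge distillation: a compact student with a
# time branch and a frequency branch, each distilling from its own pretrained
# teacher. Module sizes, temperature, and loss weights are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BranchEncoder(nn.Module):
    """Small 1-D convolutional encoder used for either domain (hypothetical sizes)."""
    def __init__(self, in_channels: int = 1, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, hidden, kernel_size=7, padding=3),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # global pooling over the time/frequency axis
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)  # (batch, hidden)


class DualBranchStudent(nn.Module):
    """Student with separate time and frequency branches fused before the classifier."""
    def __init__(self, num_classes: int, hidden: int = 32):
        super().__init__()
        self.time_branch = BranchEncoder(hidden=hidden)
        self.freq_branch = BranchEncoder(hidden=hidden)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, x_time):
        # Frequency input taken as the magnitude spectrum of the series (an assumption).
        x_freq = torch.fft.rfft(x_time, dim=-1).abs()
        t_feat = self.time_branch(x_time)
        f_feat = self.freq_branch(x_freq)
        return self.classifier(torch.cat([t_feat, f_feat], dim=-1))


def kd_loss(student_logits, teacher_logits, T: float = 4.0):
    """Standard soft-label distillation term; the temperature is an assumption."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)


def training_step(student, time_teacher, freq_teacher, x, y, alpha: float = 0.5):
    """One step: cross-entropy on labels plus a separate KD term from each teacher."""
    logits = student(x)
    with torch.no_grad():
        t_logits = time_teacher(x)                                 # teacher pretrained on raw series
        f_logits = freq_teacher(torch.fft.rfft(x, dim=-1).abs())   # teacher pretrained on spectra
    loss = F.cross_entropy(logits, y)
    loss = loss + alpha * (kd_loss(logits, t_logits) + kd_loss(logits, f_logits))
    return loss
```

Keeping the two distillation terms separate, rather than averaging the teachers' logits into a single target, is what lets the student receive each domain's knowledge independently, matching the diversity-preserving idea stated in the abstract.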









Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant No. 61903373 and the National Social Science Fund of China under Grant No. 18CZX017.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ouyang, K., Hou, Y., Zhang, Y. et al. Knowledge transfer via distillation from time and frequency domain for time series classification. Appl Intell 53, 1505–1516 (2023). https://doi.org/10.1007/s10489-022-03485-5