Abstract
Feature distillation is a technique that transfers the intermediate feature maps of a teacher network to a student network as knowledge. These feature maps reflect not only the image content but also the feature-extraction ability of the teacher network. However, existing feature distillation methods lack theoretical guidance for evaluating feature maps and suffer from size mismatches between high-dimensional and low-dimensional feature maps, as well as poor information utilization. In this paper, we propose an Adaptive Feature Map Pruning Method (AFMPM) for feature distillation, which casts feature map pruning as an optimization problem so that the valid information in the feature map is retained to the maximum extent. AFMPM achieves significant improvements in feature distillation, and its effectiveness and generality are verified through experiments on both the teacher-student distillation framework and the self-distillation framework.
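To make the idea of supervising a student with pruned teacher feature maps concrete, the sketch below shows one possible (assumed) realization in PyTorch. It is not the AFMPM algorithm described in the paper: the channel-importance score (mean absolute activation), the top-k selection, the 1x1 projection used to resolve the dimension mismatch, and the MSE matching loss are all illustrative choices.

```python
# Minimal, illustrative sketch of feature distillation with feature-map pruning.
# NOT the paper's AFMPM method; scoring, selection, projection, and loss are
# assumed placeholders chosen only to show the overall structure.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PrunedFeatureDistillLoss(nn.Module):
    def __init__(self, student_channels: int, kept_teacher_channels: int):
        super().__init__()
        # 1x1 convolution maps student features to the pruned teacher width,
        # resolving the size mismatch between the two feature maps.
        self.proj = nn.Conv2d(student_channels, kept_teacher_channels, kernel_size=1)
        self.k = kept_teacher_channels

    def forward(self, f_student: torch.Tensor, f_teacher: torch.Tensor) -> torch.Tensor:
        # f_student: (B, Cs, H, W); f_teacher: (B, Ct, H, W) with Ct >= k.
        # Score teacher channels by mean absolute activation (an assumed,
        # simple importance measure) and keep only the top-k channels.
        with torch.no_grad():
            scores = f_teacher.abs().mean(dim=(0, 2, 3))   # (Ct,)
            keep = scores.topk(self.k).indices              # (k,)
        f_teacher_pruned = f_teacher[:, keep]               # (B, k, H, W)

        # Project student features and match them to the pruned teacher map.
        f_student_proj = self.proj(f_student)               # (B, k, H, W)
        return F.mse_loss(f_student_proj, f_teacher_pruned)


if __name__ == "__main__":
    # Toy usage with random tensors standing in for intermediate activations.
    loss_fn = PrunedFeatureDistillLoss(student_channels=64, kept_teacher_channels=128)
    fs = torch.randn(8, 64, 16, 16)    # student feature map
    ft = torch.randn(8, 256, 16, 16)   # teacher feature map
    print(loss_fn(fs, ft).item())
```

In a real training loop this loss would be added to the student's task loss; AFMPM instead determines which feature-map information to retain by solving an optimization problem rather than using a fixed heuristic score.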
Data availability
The datasets generated and analysed during the current study are available from the corresponding author on reasonable request. All data generated or analysed during this study are included in this published article.
Funding
This work was supported by the Natural Science Foundation of China (Grant No. 61976098) and the Science and Technology Development Foundation of Quanzhou City (Grant No. 2020C067). The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Guo, Y., Zhang, W., Wang, J. et al. AFMPM: adaptive feature map pruning method based on feature distillation. Int. J. Mach. Learn. & Cyber. 15, 573–588 (2024). https://doi.org/10.1007/s13042-023-01926-2