A compression pipeline for one-stage object detection model

  • Original Research Paper
Journal of Real-Time Image Processing

A Correction to this article was published on 10 February 2021

This article has been updated

Abstract

Deep neural networks (DNNs) have strong fitting ability on a variety of computer vision tasks, but they also require intensive computing power and large storage space, which are not always available on portable smart devices. Although many studies have addressed the compression of image classification networks, there are few model compression algorithms for object detection models. In this paper, we propose a general compression pipeline for one-stage object detection networks to meet real-time requirements. First, we propose a softer pruning strategy for the backbone that reduces the number of filters. Compared with direct pruning, our method maintains the integrity of the network structure and reduces the drop in accuracy. Second, we transfer the knowledge of the original model to the small model by knowledge distillation to reduce the accuracy drop caused by pruning. Finally, since edge devices are better suited to integer operations, we further transform the 32-bit floating-point model into an 8-bit integer model through quantization. With this pipeline, the model size and inference time are compressed to 10% or less of the original, while the mAP is reduced by only 2.5% or less. We verified the performance of the compression pipeline on the Pascal VOC dataset.
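
The three-stage pipeline described in the abstract can be illustrated with a minimal, self-contained PyTorch sketch. The sketch below is not the authors' implementation: the toy backbone, the names TinyBackbone, soft_prune_conv, and distillation_loss, the L1-norm filter-importance criterion, the Hinton-style distillation loss, and the eager-mode post-training quantization flow are all assumptions chosen only to show the general shape of the three stages (soft pruning that keeps layer shapes intact, teacher-to-student distillation, and FP32-to-INT8 conversion).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyBackbone(nn.Module):
        """Toy stand-in for a detector backbone (hypothetical, for illustration only)."""
        def __init__(self, width=64):
            super().__init__()
            self.quant = torch.quantization.QuantStub()      # FP32 -> INT8 boundary
            self.conv1 = nn.Conv2d(3, width, 3, padding=1)
            self.relu1 = nn.ReLU()
            self.conv2 = nn.Conv2d(width, width, 3, padding=1)
            self.relu2 = nn.ReLU()
            self.dequant = torch.quantization.DeQuantStub()  # INT8 -> FP32 boundary

        def forward(self, x):
            x = self.quant(x)
            x = self.relu1(self.conv1(x))
            x = self.relu2(self.conv2(x))
            return self.dequant(x)

    def soft_prune_conv(conv, prune_ratio):
        """Stage 1 (soft pruning): zero the least-important filters (smallest L1 norm)
        in place instead of deleting them, so layer shapes and the overall network
        structure stay intact during fine-tuning."""
        with torch.no_grad():
            importance = conv.weight.abs().sum(dim=(1, 2, 3))    # one score per output filter
            n_prune = int(prune_ratio * conv.out_channels)
            if n_prune > 0:
                idx = torch.argsort(importance)[:n_prune]        # weakest filters
                conv.weight[idx] = 0.0
                if conv.bias is not None:
                    conv.bias[idx] = 0.0

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
        """Stage 2 (knowledge distillation): Hinton-style KD on classification logits,
        mixing a softened teacher/student KL term with the ordinary hard-label loss."""
        soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                        F.softmax(teacher_logits / T, dim=1),
                        reduction="batchmean") * (T * T)
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard

    if __name__ == "__main__":
        student = TinyBackbone()

        # Stage 1: soft-prune every conv layer in the backbone, then fine-tune
        # (the fine-tuning loop, driven by distillation_loss, is omitted here).
        for m in student.modules():
            if isinstance(m, nn.Conv2d):
                soft_prune_conv(m, prune_ratio=0.3)

        # Stage 3: post-training static quantization of the FP32 model to INT8.
        student.eval()
        student.qconfig = torch.quantization.get_default_qconfig("fbgemm")
        prepared = torch.quantization.prepare(student)
        prepared(torch.randn(8, 3, 224, 224))   # calibration pass (use real images in practice)
        int8_model = torch.quantization.convert(prepared)
        print(int8_model)

In a real detector, the pruned backbone would be fine-tuned with the distillation loss applied to the detection outputs of the unpruned teacher, and the calibration pass would run over representative training images before converting to INT8.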

Author information

Contributions

ZL and YS contributed equally to this work.

Corresponding author

Correspondence to Lei Xie.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Z., Sun, Y., Tian, G. et al. A compression pipeline for one-stage object detection model. J Real-Time Image Proc 18, 1949–1962 (2021). https://doi.org/10.1007/s11554-020-01053-z
