A compression pipeline for one-stage object detection model

  • Original Research Paper
Journal of Real-Time Image Processing

A Correction to this article was published on 10 February 2021

This article has been updated

Abstract

Deep neural networks (DNNs) have strong fitting ability on a variety of computer vision tasks, but they also require intensive computing power and large storage space, which are not always available on portable smart devices. Although many studies have addressed the compression of image classification networks, there are few model compression algorithms for object detection models. In this paper, we propose a general compression pipeline for one-stage object detection networks to meet real-time requirements. First, we propose a softer pruning strategy for the backbone that reduces the number of filters. Compared with direct pruning, our method maintains the integrity of the network structure and reduces the drop in accuracy. Second, we transfer the knowledge of the original model to the small model by knowledge distillation to reduce the accuracy drop caused by pruning. Finally, since edge devices are better suited to integer operations, we further transform the 32-bit floating-point model into an 8-bit integer model through quantization. With this pipeline, the model size and inference time are compressed to 10% or less of the original, while the mAP is reduced by only 2.5% or less. We verified the performance of the compression pipeline on the Pascal VOC dataset.
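
The three-stage pipeline described in the abstract can be illustrated with a minimal, self-contained PyTorch sketch. The sketch below is not the authors' implementation: the toy backbone, the names TinyBackbone, soft_prune_conv, and distillation_loss, the L1-norm filter-importance criterion, the Hinton-style distillation loss, and the eager-mode post-training quantization flow are all assumptions chosen only to show the general shape of the three stages (soft pruning that keeps layer shapes intact, teacher-to-student distillation, and FP32-to-INT8 conversion).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyBackbone(nn.Module):
        """Toy stand-in for a detector backbone (hypothetical, for illustration only)."""
        def __init__(self, width=64):
            super().__init__()
            self.quant = torch.quantization.QuantStub()      # FP32 -> INT8 boundary
            self.conv1 = nn.Conv2d(3, width, 3, padding=1)
            self.relu1 = nn.ReLU()
            self.conv2 = nn.Conv2d(width, width, 3, padding=1)
            self.relu2 = nn.ReLU()
            self.dequant = torch.quantization.DeQuantStub()  # INT8 -> FP32 boundary

        def forward(self, x):
            x = self.quant(x)
            x = self.relu1(self.conv1(x))
            x = self.relu2(self.conv2(x))
            return self.dequant(x)

    def soft_prune_conv(conv, prune_ratio):
        """Stage 1 (soft pruning): zero the least-important filters (smallest L1 norm)
        in place instead of deleting them, so layer shapes and the overall network
        structure stay intact during fine-tuning."""
        with torch.no_grad():
            importance = conv.weight.abs().sum(dim=(1, 2, 3))    # one score per output filter
            n_prune = int(prune_ratio * conv.out_channels)
            if n_prune > 0:
                idx = torch.argsort(importance)[:n_prune]        # weakest filters
                conv.weight[idx] = 0.0
                if conv.bias is not None:
                    conv.bias[idx] = 0.0

    def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
        """Stage 2 (knowledge distillation): Hinton-style KD on classification logits,
        mixing a softened teacher/student KL term with the ordinary hard-label loss."""
        soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                        F.softmax(teacher_logits / T, dim=1),
                        reduction="batchmean") * (T * T)
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard

    if __name__ == "__main__":
        student = TinyBackbone()

        # Stage 1: soft-prune every conv layer in the backbone, then fine-tune
        # (the fine-tuning loop, driven by distillation_loss, is omitted here).
        for m in student.modules():
            if isinstance(m, nn.Conv2d):
                soft_prune_conv(m, prune_ratio=0.3)

        # Stage 3: post-training static quantization of the FP32 model to INT8.
        student.eval()
        student.qconfig = torch.quantization.get_default_qconfig("fbgemm")
        prepared = torch.quantization.prepare(student)
        prepared(torch.randn(8, 3, 224, 224))   # calibration pass (use real images in practice)
        int8_model = torch.quantization.convert(prepared)
        print(int8_model)

In a real detector, the pruned backbone would be fine-tuned with the distillation loss applied to the detection outputs of the unpruned teacher, and the calibration pass would run over representative training images before converting to INT8.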

Author information

Contributions

ZL and YS contributed equally to this work.

Corresponding author

Correspondence to Lei Xie.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, Z., Sun, Y., Tian, G. et al. A compression pipeline for one-stage object detection model. J Real-Time Image Proc 18, 1949–1962 (2021). https://doi.org/10.1007/s11554-020-01053-z
