Abstract
Accurate real-time object detection plays a key role in practical scenarios such as autonomous driving and UAV surveillance. However, the limited memory and computing power of edge devices hinder the deployment of high-performance Convolutional Neural Networks (CNNs). Iterative channel pruning is an effective way to obtain lightweight networks, but in existing methods both the channel importance measurement and the iterative pruning pipeline are suboptimal. In this paper, we measure channel importance by jointly considering the scale factor of batch normalization (BN) and the kernel weights of the convolutional layers. In addition, sparsity training and fine-tuning are combined to simplify the pruning pipeline. Notably, cosine decay of the sparsity coefficient and a soft mask strategy are used to optimize our compact model, Pruned-YOLOv3/v5, which is constructed by pruning YOLOv3/v5. Experimental results on the MS-COCO and VisDrone datasets show that the proposed model achieves a satisfactory balance between computational efficiency and detection accuracy.
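The two ingredients named in the abstract can be sketched in a few lines: a channel-importance score that combines the BN scale factor with the norm of the corresponding convolution kernel, and a sparsity coefficient that follows a cosine decay over training. This is a minimal NumPy illustration under assumed formulations (the product of |γ| and the kernel's L2 norm, and a decay from an initial value to zero); the function names and the paper's exact definitions may differ.

```python
import numpy as np

def channel_importance(gamma, conv_weight):
    """Score each output channel by |BN scale factor| times the
    L2 norm of its convolution kernel.

    gamma:       BN scale factors, shape (C_out,)
    conv_weight: kernel weights, shape (C_out, C_in, k, k)
    """
    kernel_norm = np.sqrt((conv_weight ** 2).sum(axis=(1, 2, 3)))
    return np.abs(gamma) * kernel_norm

def sparsity_coefficient(step, total_steps, lam0=1e-3):
    """Cosine-decay the sparsity-regularization coefficient
    from lam0 at step 0 down to 0 at total_steps."""
    return 0.5 * lam0 * (1.0 + np.cos(np.pi * step / total_steps))

# Example: channels whose scores fall in the lowest fraction are
# candidates for (soft) masking rather than immediate removal.
scores = channel_importance(np.array([0.9, 0.05, 0.4]),
                            np.random.randn(3, 16, 3, 3))
prune_candidates = np.argsort(scores)[: len(scores) // 3]
```

With a soft mask strategy, the low-scoring channels selected this way would be zeroed but kept in the graph during sparsity training, so they can recover if their importance rises before the final pruning step.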
This work is supported by the Chinese National Natural Science Foundation (62076033, U1931202) and the BUPT innovation and entrepreneurship support program (2021-YC-T026).
© 2021 Springer Nature Switzerland AG
Zhang, J., Wang, P., Zhao, Z., Su, F. (2021). Pruned-YOLO: Learning Efficient Object Detector Using Model Pruning. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science(), vol 12894. Springer, Cham. https://doi.org/10.1007/978-3-030-86380-7_4