SR-YOLO: Small Objects Detection Based on Super Resolution

Hou, Biao; Chen, Xiaoyu; Zhou, Shenxuan; Jiang, Heng; Wang, Hao

doi:10.1007/978-3-031-14903-0_38

Biao Hou¹⁸,
Xiaoyu Chen¹⁸,
Shenxuan Zhou¹⁸,
Heng Jiang¹⁸ &
…
Hao Wang¹⁸

Part of the book series: IFIP Advances in Information and Communication Technology ((IFIPAICT,volume 659))

Included in the following conference series:

International Conference on Intelligence Science

1036 Accesses
1 Citations

Abstract

Since the introduction of convolutional neural networks, object detection based on deep learning has made great progress and has been widely used in the industry. However, because the weak and small object contains too little information, the samples are rich in diversity, and there are different degrees of occlusion, the detection difficulty is too great, and the object detection has entered a bottleneck period of development. We firstly introduce a super-resolution network to solve the problem of small object pixel area being too small, and fuse the super-resolution generator with the object detection baseline model for collaborative training. In addition, in order to reinforce the weak feature of small objects, we design a convolution block based on the edge detection operator Sobel. Experiments show that proposed method achieves mAP50 improvement of 2.4% for all classes and 4.4% for the relatively weak pedestrian class on our dataset relative to the Yolov5s baseline model.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 99.00; Price excludes VAT (USA)

Softcover Book: USD 129.99; Price excludes VAT (USA)

Hardcover Book: USD 129.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
Google Scholar
Liu, W., et al.: SSD: single shot MultiBox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Lin, T.-Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)
Google Scholar
Jocher, G., et al.: Ultralytics/yolov5: v6.0 - YOLOv5-P6 1280 Models, AWS, Supervisely and YouTube Integrations (2022)
Google Scholar
Yaeger, L., Lyon, R., Webb, B.: Effective training of a neural network character classifier for word recognition. Adv. Neural Inf. Process. Syst. 9 (1996)
Google Scholar
Simard, P.Y., Steinkraus, D., Platt, J.C., et al.: Best practices for convolutional neural networks applied to visual document analysis. In: ICDAR, vol. 3 (2003)
Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25 (2012)
Google Scholar
Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., Fergus, R.: Regularization of neural networks using dropconnect. In: International Conference on Machine Learning, pp. 1058–1066. In: PMLR (2013)
Google Scholar
Girshick, R.: Fast R-CNN IEEE International Conference on Computer Vision, pp. 1440–1448. IEEE (2015)
Google Scholar
DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: mixup: beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017)
Yun, S., Han, D., Oh, S.J., Chun, S., Choe, J., Yoo, Y.: Cutmix: regularization strategy to train strong classifiers with localizable features. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6023–6032 (2019)
Google Scholar
Bochkovskiy, A., Wang, C.-Y., Mark Liao, H.-Y.: Yolov4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934 (2020)
Goodfellow, I., et al.: Generative adversarial nets. Adv. Neural Inf. Process. Syst. 27 (2014)
Google Scholar
Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 295–307 (2015)
Article Google Scholar
Ledig, C., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4681–4690 (2017)
Google Scholar
Neubeck, A., Van Gool, L.: Efficient non-maximum suppression. In: 18th International Conference on Pattern Recognition (ICPR’06), vol. 3, pp. 850–855. IEEE (2006)
Google Scholar
Sobel, I.E.: Camera Models and Machine Perception. Stanford University (1970)
Google Scholar

Download references

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 62171347,61877066, 61771379,62001355,62101405; the Foundation for Innovative Research Groups of the National Natural Science Foundation of China under Grant 61621005; the Key Research and Development Program in Shaanxi Province of China under Grant 2019ZDLGY03-05 and 2021ZDLGY02-08; the Science and Technology Program in Xi’an of China under Grant XA2020-RGZNTJ-0021; 111 Project.

Author information

Authors and Affiliations

School of Artificial Intelligence, Xidian University, Xi’an, 710071, China
Biao Hou, Xiaoyu Chen, Shenxuan Zhou, Heng Jiang & Hao Wang

Authors

Biao Hou
View author publications
You can also search for this author in PubMed Google Scholar
Xiaoyu Chen
View author publications
You can also search for this author in PubMed Google Scholar
Shenxuan Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Heng Jiang
View author publications
You can also search for this author in PubMed Google Scholar
Hao Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xiaoyu Chen .

Editor information

Editors and Affiliations

Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Zhongzhi Shi
Department of Computer Science, University of Surrey, Guildford, UK
Yaochu Jin
College of Artificial Intelligence, Xidian University, Xi’an, China
Xiangrong Zhang

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Hou, B., Chen, X., Zhou, S., Jiang, H., Wang, H. (2022). SR-YOLO: Small Objects Detection Based on Super Resolution. In: Shi, Z., Jin, Y., Zhang, X. (eds) Intelligence Science IV. ICIS 2022. IFIP Advances in Information and Communication Technology, vol 659. Springer, Cham. https://doi.org/10.1007/978-3-031-14903-0_38

Download citation

DOI: https://doi.org/10.1007/978-3-031-14903-0_38
Published: 19 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14902-3
Online ISBN: 978-3-031-14903-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

The International Federation for Information Processing (opens in a new tab)

SR-YOLO: Small Objects Detection Based on Super Resolution