A real-time small target detection network

  • Original Paper
Signal, Image and Video Processing

Abstract

Target detection based on deep convolutional neural networks has achieved excellent performance. However, small target detection remains a challenge in computer vision. In this paper, we present an efficient network for real-time small target detection. The proposed network extracts features with a modified Darknet53 and uses a scale matching strategy to select suitable scales and anchor sizes for small targets. Within the network, we design an adaptive receptive field fusion module that increases the context information around small targets by merging features with different receptive fields. Furthermore, we propose an image cropping method for data preprocessing so that targets are trained over a wider range of scales. We conduct experiments on the VEDAI dataset and a small target dataset. Comparative results show that the proposed network achieves 74.5% mean average precision (mAP) at 50.0 FPS on the VEDAI dataset and 45.7% mAP at 51.1 FPS on the small target dataset, outperforming the other advanced target detectors compared.
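The abstract does not spell out the cropping procedure, so the following is only a minimal sketch of how crop-based preprocessing can widen the range of target scales seen in training. It assumes axis-aligned pixel boxes, a square crop window, and a fixed network input size; the function names `crop_boxes` and `apparent_size`, the `min_visible` threshold, and the numeric values are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; an assumed box convention
Box = Tuple[float, float, float, float]


def crop_boxes(boxes: List[Box], crop: Box, min_visible: float = 0.5) -> List[Box]:
    """Clip boxes to a crop window and shift them into crop coordinates.

    A box is kept only if at least `min_visible` of its area lies inside
    the crop (a hypothetical rule, chosen here for illustration).
    """
    cx0, cy0, cx1, cy1 = crop
    kept = []
    for x0, y0, x1, y1 in boxes:
        # Intersection of the box with the crop window
        ix0, iy0 = max(x0, cx0), max(y0, cy0)
        ix1, iy1 = min(x1, cx1), min(y1, cy1)
        inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
        area = (x1 - x0) * (y1 - y0)
        if area > 0 and inter / area >= min_visible:
            kept.append((ix0 - cx0, iy0 - cy0, ix1 - cx0, iy1 - cy0))
    return kept


def apparent_size(box: Box, crop_size: int, input_size: int) -> Tuple[float, float]:
    """Width and height of a box after the square crop is resized to the network input."""
    x0, y0, x1, y1 = box
    scale = input_size / crop_size
    return ((x1 - x0) * scale, (y1 - y0) * scale)


if __name__ == "__main__":
    boxes = [(100, 100, 120, 110), (560, 560, 600, 600)]
    cropped = crop_boxes(boxes, crop=(64, 64, 576, 576))
    print(cropped)  # the second box is mostly outside the crop and is dropped
    # The same target looks larger when it comes from a smaller crop.
    for crop_size in (512, 256):
        print(crop_size, apparent_size(cropped[0], crop_size, input_size=416))
```

Under these assumptions, sampling crops of different sizes and resizing each to the same input resolution presents a given target at several apparent sizes, which is one plausible reading of training targets "over a wider range of scales".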

Author information

Corresponding author

Correspondence to Haibo Luo.

About this article

Cite this article

Ju, M., Luo, J., Liu, G. et al. A real-time small target detection network. SIViP 15, 1265–1273 (2021). https://doi.org/10.1007/s11760-021-01857-x

  • DOI: https://doi.org/10.1007/s11760-021-01857-x
