Small object detection via dual inspection mechanism for UAV visual images

Tian, Gangyi; Liu, Jianran; Zhao, Hong; Yang, Wenyuan

doi:10.1007/s10489-021-02512-1

Published: 20 July 2021

Volume 52, pages 4244–4257, (2022)

Applied Intelligence

Gangyi Tian^1,2, Jianran Liu^2, Hong Zhao^1,2 & Wenyuan Yang^1,2 (ORCID: orcid.org/0000-0002-8372-7314)

Abstract

Unmanned Aerial Vehicles (UAVs) are used in place of humans to complete aerial assignments in various fields. With the development of computer vision, object detection has become one of the core technologies in UAV applications. However, detectors often miss small targets, and detection performance on small targets is far lower than on large targets. In this paper, we propose a dual inspection mechanism, which identifies missed targets in suspicious areas to assist single-stage detection branches, and shares dual decisions so that feature-level multi-instance detection modules produce reliable results. Firstly, we confirm that the detection results contain missed targets, and that these lie among the results that do not reach the confidence threshold. For this reason, the feature vector provided by a denoising sparse autoencoder is calculated, and this part of the results is filtered again. Secondly, we show empirically that a single detection result is not reliable enough, and that multiple attributes of the target need to be considered. Motivated by this, the initial and secondary detection results are combined and ranked by importance. Finally, we assign the corresponding confidence to the top-ranked instance, making it possible for it to be recognized as an object again. Experimental results show that our mechanism improves mAP by 2.7% on the VisDrone2020 dataset, 1.0% on the UAVDT dataset and 1.8% on the MS COCO dataset. The proposed detection mechanism achieves state-of-the-art levels on these datasets and performs better on small object detection.
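The steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Detection` type, the `prototype` vector, the cosine-similarity secondary score, the weight `alpha` and the `top_k` cut-off are all assumptions introduced here for illustration, and the `feature` field merely stands in for a vector that a denoising sparse autoencoder might provide.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence, Tuple


@dataclass
class Detection:
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    score: float                            # confidence from the single-stage detector
    feature: Sequence[float]                # stand-in for an autoencoder feature vector


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def dual_inspection(
    dets: List[Detection],
    prototype: Sequence[float],   # hypothetical reference feature of a true object
    threshold: float = 0.5,
    alpha: float = 0.5,           # hypothetical weight between the two decisions
    top_k: int = 1,
) -> List[Detection]:
    # First inspection: keep everything that reaches the confidence threshold.
    accepted = [d for d in dets if d.score >= threshold]
    # Candidates that fell below the threshold may contain missed targets.
    suspicious = [d for d in dets if d.score < threshold]

    # Second inspection: score each candidate by feature similarity (assumed
    # here), then combine both decisions into one importance value.
    def importance(d: Detection) -> float:
        return alpha * d.score + (1 - alpha) * cosine(d.feature, prototype)

    ranked = sorted(suspicious, key=importance, reverse=True)
    # Give the top-ranked candidates a confidence that lets them count again.
    recovered = [Detection(d.box, threshold, d.feature) for d in ranked[:top_k]]
    return accepted + recovered


dets = [
    Detection((0, 0, 10, 10), 0.90, [1.0, 0.0]),
    Detection((20, 20, 26, 26), 0.40, [1.0, 0.1]),
    Detection((40, 40, 45, 45), 0.45, [0.0, 1.0]),
]
result = dual_inspection(dets, prototype=[1.0, 0.0])
```

In this toy input the second box has a lower detector score than the third, but its feature is much closer to the assumed prototype, so it ranks first after the two decisions are combined and is the one restored.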

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 61703196, the Natural Science Foundation of Fujian Province under Grant 2020J01821 and the Key Science Foundation of Zhangzhou City under Grant ZZ2019ZD11.

Author information

Authors and Affiliations

School of Computer Science, Minnan Normal University, Zhangzhou, 363000, China
Gangyi Tian, Hong Zhao & Wenyuan Yang
Fujian Key Laboratory of Granular Computing and Application, Minnan Normal University, Zhangzhou, 363000, China
Gangyi Tian, Jianran Liu, Hong Zhao & Wenyuan Yang

Corresponding author

Correspondence to Wenyuan Yang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tian, G., Liu, J., Zhao, H. et al. Small object detection via dual inspection mechanism for UAV visual images. Appl Intell 52, 4244–4257 (2022). https://doi.org/10.1007/s10489-021-02512-1

Accepted: 05 May 2021
Published: 20 July 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s10489-021-02512-1
