Abstract
Object detection in aerial remote sensing images is a fundamental task in the earth observation community. However, several challenges remain in this field, including the varied appearance of the targets to be detected, the complexity of image backgrounds, and the high cost of manual annotation. To tackle these problems, we propose a Faster R-CNN-based framework with several dedicated designs. Our detector incorporates a bidirectional enhancement feature pyramid network, which improves multi-scale feature extraction so that objects of different sizes are handled effectively. In addition, an attention module is introduced to further suppress noisy background. Moreover, we augment the training sets using a count-guided deep descriptor transforming (CG-DDT) algorithm, which automatically generates coarse object bounding boxes for images annotated with only class labels and per-class object counts. We evaluate the proposed method on the popular aerial remote sensing benchmarks NWPU VHR-10 and DOTA, and the experimental results show that it detects targets accurately while reducing the cost of manual annotation during training.
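To make the count-guided idea concrete, the following is a minimal sketch and not the authors' CG-DDT implementation. It assumes that a DDT-style step has already produced a 2D indicator map in which positive values mark likely object locations, and it assumes that the per-class count is used to keep that many of the largest positive regions. The function name count_guided_boxes, the 4-connectivity, and the largest-area rule are all hypothetical choices made for illustration.

```python
from collections import deque


def count_guided_boxes(indicator, count):
    """Return up to `count` coarse boxes as (row_min, col_min, row_max, col_max).

    `indicator` is a 2D list of floats; positive values are treated as object
    evidence (assumption). Boxes are ordered from largest region to smallest.
    """
    rows = len(indicator)
    cols = len(indicator[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []  # list of (area, box)

    for r in range(rows):
        for c in range(cols):
            if indicator[r][c] <= 0 or seen[r][c]:
                continue
            # Breadth-first flood fill over one 4-connected positive region.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                area += 1
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and indicator[ny][nx] > 0:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((area, (r0, c0, r1, c1)))

    # Assumed count-guided rule: keep only the `count` largest regions.
    regions.sort(key=lambda item: item[0], reverse=True)
    return [box for _, box in regions[:count]]


# Toy indicator map with three positive regions of areas 4, 2, and 1.
indicator = [
    [0.9, 0.8, -0.2, -0.1, -0.3],
    [0.7, 0.6, -0.4, 0.5, -0.2],
    [-0.1, -0.3, -0.2, 0.4, -0.1],
    [-0.2, 0.1, -0.5, -0.3, -0.2],
]
boxes = count_guided_boxes(indicator, count=2)
print(boxes)
```

With an annotated count of 2, the single-pixel region is discarded and only the two larger regions yield coarse boxes. Such boxes could then serve as approximate training annotations for images that carry only a class label and an object count.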
Acknowledgments
The authors would like to thank all reviewers and editors for their constructive comments on this study.
Funding
This work was supported by the National Natural Science Foundation of China (62172229), the Natural Science Foundation of Jiangsu Province (BK20211294, BK20211295), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX22_0996).
Ethics declarations
Conflicts of Interest
The authors declare no conflict of interest.
About this article
Cite this article
Yang, P., Yu, D. & Yang, G. Object detection in aerial remote sensing images using bidirectional enhancement FPN and attention module with data augmentation. Multimed Tools Appl 83, 38635–38656 (2024). https://doi.org/10.1007/s11042-023-16973-8
DOI: https://doi.org/10.1007/s11042-023-16973-8