Abstract
Object detection in aerial images is challenging because objects appear at varied orientations and often lack discriminative features. Existing methods usually face a trade-off between accuracy and speed: one-stage anchor-free detectors run inference faster than two-stage frameworks, but their predictions are less accurate. This paper proposes a fast and accurate detector, the Mask-guidEd Anchor-free Detector (MEAD). It rapidly locates oriented objects in aerial images by means of per-pixel prediction. Furthermore, it embeds a cascade architecture to locate targets more precisely. To enhance feature discrimination, a mask-guided branch is employed to force features to attend to the foreground regions. Comparative experiments are conducted on the DOTA and HRSC2016 datasets. The results show that MEAD outperforms current state-of-the-art anchor-free detectors, achieving an mAP of 74.33 on DOTA and 89.83 on HRSC2016.
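The abstract does not say how the mask-guided branch is wired into the detector. As a rough illustration of the general idea only, the sketch below assumes a hypothetical mask branch that outputs one foreground logit per pixel and uses its sigmoid to scale that pixel's feature vector. The function name `mask_guided_reweight`, the input layout, and the multiplicative gating are all assumptions made for illustration, not the authors' design.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mask_guided_reweight(features, mask_logits):
    """Scale each pixel's feature vector by its predicted foreground probability.

    features:    H x W x C nested lists of floats.
    mask_logits: H x W nested lists of floats (hypothetical mask-branch output).
    Returns a new H x W x C nested list; the inputs are not modified.
    """
    weighted = []
    for feat_row, logit_row in zip(features, mask_logits):
        out_row = []
        for feat_vec, logit in zip(feat_row, logit_row):
            weight = sigmoid(logit)  # close to 1 on foreground, close to 0 on background
            out_row.append([weight * value for value in feat_vec])
        weighted.append(out_row)
    return weighted


# Toy 1 x 2 feature map with 2 channels: the first pixel is confidently
# foreground, the second confidently background.
features = [[[1.0, 2.0], [1.0, 2.0]]]
mask_logits = [[6.0, -6.0]]
weighted = mask_guided_reweight(features, mask_logits)
```

In this toy case the foreground pixel keeps almost all of its feature magnitude while the background pixel is suppressed toward zero, which is the sense in which a mask can steer features toward foreground regions.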
Acknowledgements
The authors are thankful for the financial support from the National Key Research and Development Program of China (2018YFB1404400) and the National Natural Science Foundation of China (Grant Nos. 61906190, U1936206, 61976213 and 61976212).
Cite this article
He, Z., Ren, Z., Yang, X. et al. MEAD: a Mask-guidEd Anchor-free Detector for oriented aerial object detection. Appl Intell 52, 4382–4397 (2022). https://doi.org/10.1007/s10489-021-02570-5
DOI: https://doi.org/10.1007/s10489-021-02570-5