research-article

Multi-Scale Deformable Context Fusion Pyramid Network for Object Detection in Aerial Images

Authors:

Tianyang Zhang

ICRAI '22: Proceedings of the 8th International Conference on Robotics and Artificial Intelligence

Pages 32 - 38

https://doi.org/10.1145/3573910.3573919

Published: 20 January 2023

Abstract

Although many methods have made good progress in object detection for aerial images in recent years, improving detection accuracy remains difficult because of the large scale variation, extreme aspect ratios, and large number of small, densely packed objects in such images. In this paper, we propose a Multi-Scale Deformable Context Fusion Pyramid Network (Multi-Scale DCFPN) for object detection in aerial images. Within the proposed network, we design a Multi-Scale Deformable Context Fusion (MDCF) module, which injects context information into the Feature Pyramid Network (FPN) in a residual manner. In particular, the MDCF module consists of a set of parallel Adaptive Deformable Context (ADC) modules with specific dilation rates and an Adaptive Context Fusion (ACF) module. The ADC modules focus on objects with extreme aspect ratios by introducing a deformable paradigm, and the ACF module adaptively combines the multi-scale context instead of simply summing it. To verify the effectiveness of the proposed method, we conduct comparative experiments on the public aerial datasets NWPU VHR-10 and DIOR, where mAP increases by 3.5 and 1.0 points over the baseline, respectively. The experimental results also show that the proposed method outperforms the state-of-the-art methods.
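The data flow described in the abstract (parallel context branches at several dilation rates, an adaptive weighted fusion in place of plain summation, and a residual injection back into the feature) can be illustrated with a toy sketch. This is not the authors' implementation: it works on a 1D list of floats rather than FPN feature maps, it omits the deformable sampling of the ADC modules entirely, the dilation rates `(1, 2, 3)` are an assumption, and the fusion scores, which a real ACF module would presumably learn, are replaced by a hypothetical stand-in. All function names are made up for illustration.

```python
import math


def dilated_context(x, rate):
    """Toy 1D context branch: 3-tap mean filter with the given dilation rate.

    Positions outside the signal are treated as zero (zero padding).
    """
    n = len(x)
    out = []
    for i in range(n):
        taps = [x[j] if 0 <= j < n else 0.0 for j in (i - rate, i, i + rate)]
        out.append(sum(taps) / 3.0)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(branches, scores):
    """Per-position weighted sum of branches.

    The weights at each position are a softmax over the branch scores,
    so the branches are blended rather than simply summed.
    """
    n = len(branches[0])
    fused = []
    for i in range(n):
        weights = softmax([s[i] for s in scores])
        fused.append(sum(w * b[i] for w, b in zip(weights, branches)))
    return fused


def mdcf_toy(feature, rates=(1, 2, 3)):
    """Toy analogue of the described pipeline (rates are an assumption)."""
    branches = [dilated_context(feature, r) for r in rates]
    # Hypothetical stand-in: each branch is scored by its own response.
    # A trained module would predict these scores instead.
    scores = branches
    context = adaptive_fusion(branches, scores)
    # Residual injection: add the fused context back onto the input feature.
    return [f + c for f, c in zip(feature, context)]


if __name__ == "__main__":
    print(mdcf_toy([0.0, 1.0, 4.0, 1.0, 0.0]))
```

With equal scores the softmax weights are uniform, so `adaptive_fusion` reduces to the mean of the branches; unequal scores shift the blend toward the higher-scoring branch at each position.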

Index Terms

1. Computing methodologies
  1. Computer graphics
    1. Image manipulation
      1. Image processing

Information & Contributors

Information

Published In

ICRAI '22: Proceedings of the 8th International Conference on Robotics and Artificial Intelligence

November 2022

89 pages

ISBN:9781450397544

DOI:10.1145/3573910

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICRAI 2022

ICRAI 2022: 2022 8th International Conference on Robotics and Artificial Intelligence

November 18 - 20, 2022

Singapore, Singapore
