ABSTRACT
Vehicle detection in Unmanned Aerial Vehicle (UAV) images is challenging because such images contain many small objects and object scale varies greatly, both of which make detection difficult for existing algorithms. This paper proposes an anchor-free detector, the Residual Feature Enhancement Pyramid Network (RFEPNet), for UAV vehicle detection. RFEPNet comprises a Cross-Level Context Fusion Network (CLCFNet) and a Residual Feature Enhancement Module (RFEM) based on pyramid convolution. Specifically, CLCFNet uses a densely connected structure and a Dual Attention Fusion Module (DAFM) to increase the sensitivity of high-resolution feature maps to small objects. RFEM, in turn, exploits pyramid convolution and residual connections to enhance the semantic information of the feature pyramid. An anchor-free head performs classification and bounding box regression. Experimental results on the UAVDT dataset show that the proposed RFEPNet achieves state-of-the-art performance.
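To make the phrase "anchor-free head" concrete, the following is a minimal sketch of how per-location regression targets are commonly formed in anchor-free detectors of the FCOS family. It is an assumption that RFEPNet's head follows this style; the abstract does not say so. The function names (`level_locations`, `ltrb_target`, `decode_box`), the stride value, and the example box are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in image pixels
LTRB = Tuple[float, float, float, float]  # distances to the left, top, right, bottom sides


def level_locations(height: int, width: int, stride: int) -> List[Tuple[int, int]]:
    """Map every cell of a (height x width) feature map to an image-space point.

    Each cell is represented by the centre of the image patch it covers
    (assumed convention; not taken from the paper).
    """
    half = stride // 2
    return [(col * stride + half, row * stride + half)
            for row in range(height) for col in range(width)]


def ltrb_target(x: float, y: float, box: Box) -> Optional[LTRB]:
    """Regression target for location (x, y): distances to the four box sides.

    Returns None when the location is not strictly inside the box, in which
    case the location would be treated as a negative sample.
    """
    x_min, y_min, x_max, y_max = box
    left, top = x - x_min, y - y_min
    right, bottom = x_max - x, y_max - y
    if min(left, top, right, bottom) <= 0:
        return None
    return (left, top, right, bottom)


def decode_box(x: float, y: float, ltrb: LTRB) -> Box:
    """Invert ltrb_target: recover the box from a location and its distances."""
    left, top, right, bottom = ltrb
    return (x - left, y - top, x + right, y + bottom)


if __name__ == "__main__":
    vehicle = (2.0, 3.0, 10.0, 9.0)  # hypothetical small vehicle box
    for x, y in level_locations(2, 2, stride=8):
        print((x, y), ltrb_target(x, y, vehicle))
```

Because every location inside a box predicts its own four distances, no anchor sizes or aspect ratios need to be preset, which is one reason this family of heads is often applied to scenes with large scale variation.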
Index Terms
- An Anchor-free Detector Based on Residual Feature Enhancement Pyramid Network for UAV Vehicle Detection