Abstract
With the widespread application of unmanned aerial vehicles (UAVs), object detection from the UAV perspective has received increasing attention. The high density and small size of objects in images captured by UAVs pose significant challenges for detection. To address this problem, we propose SIE-YOLOv5, a detection algorithm for UAV images. Building on YOLOv5, we examine the downsampling process and propose the Spatial Information Extraction (SIE) structure, which makes fuller use of the information in feature maps and improves the detection of small objects. We then improve the feature pyramid pooling structure and propose a new LeakySPPF module, which runs faster while maintaining comparable performance. To better locate attention regions, we incorporate the CBAM attention mechanism [1] into the model. Finally, we improve the IoU calculation of the baseline model by introducing Wise-IoU [2]. Extensive experiments demonstrate that SIE-YOLOv5 detects small objects in UAV-captured scenes better than the baseline. On the VisDrone2021 dataset, the mAP is improved by 6.6%.
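As a rough illustration of why small objects make IoU-based localization harder, the sketch below computes plain IoU for two axis-aligned boxes. It is not the Wise-IoU loss of [2] nor the authors' implementation; the function name `box_iou` and the `(x1, y1, x2, y2)` box format are assumptions made for this example.

```python
def box_iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative helper only; the name and box format are assumptions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# The same 2-pixel horizontal shift applied to a small and a large box.
small_iou = box_iou((0, 0, 10, 10), (2, 0, 12, 10))
large_iou = box_iou((0, 0, 100, 100), (2, 0, 102, 100))
print(f"10x10 box:   IoU = {small_iou:.3f}")
print(f"100x100 box: IoU = {large_iou:.3f}")
```

The same 2-pixel offset costs the 10x10 box far more IoU than the 100x100 box, which is one reason the IoU calculation matters more when most targets are small.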
References
1. Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: CBAM: convolutional block attention module. CoRR abs/1807.06521 (2018)
2. Tong, Z., Chen, Y., Xu, Z., Yu, R.: Wise-IoU: bounding box regression loss with dynamic focusing mechanism (2023)
3. Kellenberger, B., Marcos, D., Tuia, D.: Detecting mammals in UAV images: best practices to address a substantially imbalanced dataset with deep learning. Remote Sens. Environ. 216, 139–153 (2018)
4. Hird, J.N.: Use of unmanned aerial vehicles for monitoring recovery of forest vegetation on petroleum well sites. Remote Sens. 9, 413 (2017)
5. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. CoRR abs/1405.0312 (2014)
6. Everingham, M., Gool, L.V., Williams, C.K.I., Winn, J.M., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. Int. J. Comput. Vis. 88, 303–338 (2010)
7. Bochkovskiy, A., Wang, C.-Y., Liao, H.-Y.M.: YOLOv4: optimal speed and accuracy of object detection. CoRR abs/2004.10934 (2020)
8. Wang, C.-Y., Bochkovskiy, A., Liao, H.-Y.M.: YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors (2022)
9. Zhang, Y.-F., Ren, W., Zhang, Z., Jia, Z., Wang, L., Tan, T.: Focal and efficient IoU loss for accurate bounding box regression (2022)
10. Zheng, Z., et al.: Enhancing geometric factors in model learning and inference for object detection and instance segmentation. IEEE Trans. Cybern. 52(8), 8574–8586 (2022)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wen, Z., Su, J., Zhang, Y. (2023). SIE-YOLOv5: Improved YOLOv5 for Small Object Detection in Drone-Captured-Scenarios. In: Jin, Z., Jiang, Y., Buchmann, R.A., Bi, Y., Ghiran, AM., Ma, W. (eds) Knowledge Science, Engineering and Management. KSEM 2023. Lecture Notes in Computer Science, vol 14118. Springer, Cham. https://doi.org/10.1007/978-3-031-40286-9_4
DOI: https://doi.org/10.1007/978-3-031-40286-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40285-2
Online ISBN: 978-3-031-40286-9
eBook Packages: Computer Science, Computer Science (R0)