
YOLO-MSFR: real-time natural disaster victim detection based on improved YOLOv5 network

  • Research
  • Published in: Journal of Real-Time Image Processing

Abstract

In catastrophic events, automatic, accurate, and fast detection of objects in images captured by unmanned aerial vehicles (UAVs) can significantly reduce the time required for manual search and rescue. Existing victim detection methods, however, are not sufficiently robust at identifying partially occluded and multiscale objects against diverse backgrounds. To overcome this problem, we propose a hybrid-domain attention algorithm based on YOLOv5 with multiscale feature reuse (YOLO-MSFR). First, to address the issue of victim targets being easily masked by complex backgrounds, which complicates the representation of target attributes, a channel-spatial domain attention module was built to improve the network's ability to express target features. Second, to address the problem of easily missed multiscale victim features, a multiscale feature reuse (MSFR) module was designed to ensure that large-scale target features are expressed effectively while enhancing the expression of small-target features. The MSFR module is built on dilated convolution, which mitigates the loss of small-target feature information during downsampling in the backbone network; features are reused through cascaded residual connections to reduce the number of training parameters and to prevent vanishing gradients as the network deepens. Finally, the efficient intersection over union (EIOU) loss function was adopted to accelerate network convergence and improve the network's ability to locate victims accurately. The proposed algorithm was compared with five classical object detection algorithms on data from multiple disaster environments to verify its advantages. The experimental results show that the proposed algorithm can accurately detect multiscale victim targets in complex natural disasters, reaching an mAP of 91.0% and a detection speed of 42 fps at 640 × 640 resolution, indicating good real-time performance.
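To make the EIOU term concrete, the sketch below computes the loss for two axis-aligned boxes in pure Python. This follows the standard EIOU formulation (1 − IoU, plus normalized penalties on center distance, width mismatch, and height mismatch); it is an illustrative assumption on our part, not the paper's exact implementation, which is not given in the abstract.

```python
def eiou_loss(box_p, box_g):
    """EIOU loss between a predicted and a ground-truth box, each (x1, y1, x2, y2).

    L_EIOU = 1 - IoU + d^2 / c^2 + dw^2 / cw^2 + dh^2 / ch^2,
    where d is the center distance, c the diagonal of the smallest enclosing
    box, and (cw, ch) that box's width and height.
    """
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g

    # Intersection and union areas
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union

    # Smallest enclosing box and its squared diagonal
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw * cw + ch * ch

    # Squared distance between box centers
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2
    d2 = dx * dx + dy * dy

    # Width and height mismatch penalties (the terms EIOU adds over CIOU-style losses)
    dw2 = ((px2 - px1) - (gx2 - gx1)) ** 2
    dh2 = ((py2 - py1) - (gy2 - gy1)) ** 2

    return 1.0 - iou + d2 / c2 + dw2 / (cw * cw) + dh2 / (ch * ch)
```

For a perfect prediction the loss is exactly 0; unlike plain 1 − IoU, the penalty terms keep gradients informative even when boxes barely overlap, which is why such losses speed up convergence.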


Data availability

Not applicable.

Abbreviations

MSFR: Multiscale feature reuse
CNN: Convolutional neural network
R-CNN: Region-CNN
YOLO: You only look once
RNN: Recurrent neural network
CV: Computer vision
SSD: Single shot detector
CBAM: Convolutional block attention module
FPN: Feature pyramid network
MLP: Multilayer perceptron
FPS: Frames per second
IoU: Intersection over union
TP: True positive
TN: True negative
FN: False negative
FP: False positive


Funding

This research was funded by the National Natural Science Foundation of China (51804250); the China Postdoctoral Science Foundation (2019M653874XB, 2020M683522); the Scientific Research Program of Shaanxi Provincial Department of Education (18JK0512); the Natural Science Basic Research Program of Shaanxi (2021JQ-572, 2020JQ-757); and the Innovation Capability Support Program of Shaanxi (2020TD-021).

Author information


Contributions

Conceptualization, S.H.; methodology, Q.Z.; software, X.M.; validation, Y.W., S.G.; formal analysis, C.Y.; investigation, T.H.; resources, Q.Z.; data curation, Q.Z.; writing—original draft preparation, S.H. and Q.Z.; writing—review and editing, S.H. and Q.Z.; visualization, S.H. and Q.Z.; supervision, S.H.; project administration, S.H.; funding acquisition, S.H. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Xu Ma.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Hao, S., Zhao, Q., Ma, X. et al. YOLO-MSFR: real-time natural disaster victim detection based on improved YOLOv5 network. J Real-Time Image Proc 21, 7 (2024). https://doi.org/10.1007/s11554-023-01383-8

