
APH-YOLOv7t: A YOLO Attention Prediction Head for Search and Rescue with Drones

  • Conference paper
Robot 2023: Sixth Iberian Robotics Conference (ROBOT 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 978)

Abstract

Drone-based inspection and intervention in rescue operations are receiving growing attention, driven by emergencies of both natural and human origin. In addition, rapid advances in vision sensors, object detection models, and AI-based methods can improve the success of rescue missions. Our main motivation is to support Search and Rescue with affordable drone technology. Detecting missing persons from drones is the key task in this context. Drone navigation introduces object scale variations, which add computational load, and the scenes call for high-speed processing. To address these two issues, we propose the APH-YOLOv7t method, which follows the holdout method. In this paper, we introduce a new Attention-based Prediction Head for YOLOv7-tiny. We also present evaluation results for YOLOv7, the state-of-the-art convolutional neural network-based solution, which is used here for robust object detection. In drone navigation, persons must be detected on both land and sea surfaces so that they can be identified and rescued, reducing disaster and distress. Despite the high success rate of object detection models, visual complexity makes detection in drone-captured images more challenging, and this area remains under-explored. We used three existing search and rescue datasets of drone-acquired images specific to our objective. Results show that our APH-YOLOv7t method was the most robust attention-based YOLO and the most comprehensive person detection method for our application, demonstrating a consistently high level of performance in comparison to YOLOv7-tiny. Evaluation results on all three datasets are reported. With this solution and its conditional performance, we show that it satisfies our requirements: a mean average precision (mAP50) above 0.80 for the person class and an operational speed above 125 fps on a single Nvidia RTX 2080 Ti GPU.
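The two requirements quoted in the abstract can be made concrete with a minimal sketch. It is not the authors' evaluation code. It assumes the usual meaning of mAP50, where a predicted box counts as a match when its intersection over union (IoU) with a ground-truth box is at least 0.50, and it converts a frame rate into a per-frame time budget. The function names and the `(x1, y1, x2, y2)` box layout are illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box, iou_threshold=0.5):
    """Assumed mAP50 matching rule: a prediction matches when IoU >= 0.50."""
    return iou(pred_box, gt_box) >= iou_threshold


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Hypothetical boxes: the prediction is shifted by half the box width.
    print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
    print(is_match_at_50((0, 0, 10, 10), (5, 0, 15, 10)))
    # 125 fps leaves 8 ms per frame for the whole detection pipeline.
    print(frame_budget_ms(125))
```

Under these assumptions, the 125 fps target corresponds to a budget of 8 ms per frame, which shows why a lightweight model such as YOLOv7-tiny is a natural base.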

Notes

  1. https://github.com/lacmus-foundation/lacmus

  2. https://roboflow.com/

  3. https://cocodataset.org/

Acknowledgements

This work has been supported by PRR Project “Agenda Mobilizadora Sines Nexus” (ref. No. 7113), and by the Portuguese Foundation for Science and Technology (FCT) Ph.D. studentship UI/BD/154587/2023, co-funded by the European Social Fund and by the State Budget of the Portuguese Ministry of Education and Science.

Author information

Corresponding author

Correspondence to Vamshi Kodipaka.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kodipaka, V., Marques, L., Cortesão, R., Araújo, H. (2024). APH-YOLOv7t: A YOLO Attention Prediction Head for Search and Rescue with Drones. In: Marques, L., Santos, C., Lima, J.L., Tardioli, D., Ferre, M. (eds) Robot 2023: Sixth Iberian Robotics Conference. ROBOT 2023. Lecture Notes in Networks and Systems, vol 978. Springer, Cham. https://doi.org/10.1007/978-3-031-59167-9_22
