APH-YOLOv7t: A YOLO Attention Prediction Head for Search and Rescue with Drones

Kodipaka, Vamshi; Marques, Lino; Cortesão, Rui; Araújo, Hélder

doi:10.1007/978-3-031-59167-9_22


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 978)

Included in the following conference series:

Iberian Robotics conference

Abstract

Inspection and intervention by drones in rescue operations are receiving growing attention because of the many events that call for them, both natural and man-made. In addition, rapid advances in vision sensors, object detection models, and AI-based methods can improve the success of rescue missions. Our main motivation is to empower Search and Rescue through affordable drone technology, and detecting missing persons with drones is the key task in this context. Drone navigation introduces object scale variations, and the resulting computational load calls for high-speed processing. To address these two issues, we propose the APH-YOLOv7t method, which follows the holdout method. In this paper, we introduce a new Attention-based Prediction Head for YOLOv7-tiny. We also present evaluation results for YOLOv7, a state-of-the-art convolutional neural network-based detector, which is used here for robust object detection. In drone navigation, persons must be detected on both land and sea surfaces so that they can be identified and rescued, reducing the impact of disasters and distress. Despite the high success rate of object detection models, visual complexity makes detection on drone-captured images more challenging, and this area remains under-explored. We used three existing search and rescue datasets, each consisting of drone-acquired images relevant to our objective, and we report evaluation results on all three. The results show that APH-YOLOv7t was the most robust attention-based YOLO and the most comprehensive person detection method for our application, with a consistently high level of performance in comparison to YOLOv7-tiny. With this solution, and subject to these conditions, we show that it meets our requirements of a mean average precision (mAP50) above 0.80 for the person class and an operating speed above 125 fps on a single Nvidia RTX 2080 Ti GPU.
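
The two requirements quoted in the abstract can be made concrete with a small sketch. It assumes the usual reading of mAP50, in which a predicted box counts as a match when its intersection over union (IoU) with a ground-truth box is at least 0.5, and it converts a frame rate into a per-frame time budget. The function names and the corner-coordinate box format are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch only: helper names and the (x1, y1, x2, y2) box format
# are assumptions, not the authors' code.

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so that non-overlapping boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(pred_box, gt_box):
    """True when the prediction meets the IoU >= 0.5 criterion behind mAP50."""
    return iou(pred_box, gt_box) >= 0.5


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    print(matches_at_50((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes
    print(matches_at_50((0, 0, 10, 10), (5, 0, 15, 10)))  # half-shifted box
    print(frame_budget_ms(125))
```

At 125 fps the budget is 8 ms per frame, which has to cover the whole per-frame pipeline if that rate is to be sustained. Full mAP50 additionally ranks detections by confidence and averages precision over recall levels and classes; only the matching criterion is shown here.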

Acknowledgements

This work has been supported by PRR Project “Agenda Mobilizadora Sines Nexus” (ref. No. 7113), and by the Portuguese Foundation for Science and Technology (FCT) Ph.D. studentship UI/BD/154587/2023, co-funded by the European Social Fund and by the State Budget of the Portuguese Ministry of Education and Science.

Author information

Authors and Affiliations

Institute of Systems and Robotics, Department of Electrical and Computer Engineering, University of Coimbra, Coimbra, Portugal
Vamshi Kodipaka, Lino Marques, Rui Cortesão & Hélder Araújo

Corresponding author

Correspondence to Vamshi Kodipaka.

Editor information

Editors and Affiliations

Dep. de Eng. Electrotecnica e de Computadores, University of Coimbra, Coimbra, Portugal
Lino Marques
Department of Electrónica Industrial, University of Minho, Escola de Engenharia, Guimarães, Portugal
Cristina Santos
Department of Electrical Engineering, ESTIG, Polytechnic Institute of Bragança, Bragança, Portugal
José Luís Lima
Centro Universitario de la Defensa, Zaragoza, Spain
Danilo Tardioli
Centre for Automation and Robotics UPM-CSIC, Universidad Politécnica de Madrid, Madrid, Spain
Manuel Ferre

About this paper

Cite this paper

Kodipaka, V., Marques, L., Cortesão, R., Araújo, H. (2024). APH-YOLOv7t: A YOLO Attention Prediction Head for Search and Rescue with Drones. In: Marques, L., Santos, C., Lima, J.L., Tardioli, D., Ferre, M. (eds) Robot 2023: Sixth Iberian Robotics Conference. ROBOT 2023. Lecture Notes in Networks and Systems, vol 978. Springer, Cham. https://doi.org/10.1007/978-3-031-59167-9_22

DOI: https://doi.org/10.1007/978-3-031-59167-9_22
Published: 27 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-59166-2
Online ISBN: 978-3-031-59167-9
eBook Packages: Intelligent Technologies and Robotics (R0)
