A Self-attention Network for Face Detection Based on Unmanned Aerial Vehicles

Hua, Shunfu; Fan, Huijie; Ding, Naida; Li, Wei; Tang, Yandong

doi:10.1007/978-3-031-13822-5_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13456)

Included in the following conference series:

International Conference on Intelligent Robotics and Applications

Abstract

Face detection from Unmanned Aerial Vehicles (UAVs) faces two challenges. (1) Scale variation: as a UAV flies, the apparent size of faces changes with its distance from them, which makes detection harder. (2) A lack of specialized face detection datasets, which causes a sharp drop in algorithm accuracy. To address these two issues, we take full advantage of existing open benchmarks to train our model. However, the gap is too large when face detectors are adapted from the ground to the air. Therefore, we propose a novel network called Face Self-attention Network (FSN) to achieve high performance. We conduct extensive experiments on the standard WIDER FACE benchmark. The experimental results demonstrate that FSN can detect multi-scale faces accurately.
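
The abstract does not describe the internal design of FSN, so the following is background only: a minimal sketch of generic scaled dot-product self-attention applied to a flattened feature map, which is the general mechanism the network's name refers to. It is not the authors' architecture. The function name `self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the toy sizes are all assumptions made for illustration.

```python
import numpy as np

def self_attention(features, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention over N positions.

    features: (N, C) array, e.g. a flattened H*W feature map with C channels.
    w_q, w_k, w_v: (C, D) projection matrices (hypothetical parameters).
    Returns the attended features (N, D) and the attention weights (N, N).
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    # Similarity of every position to every other position, scaled by sqrt(D).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy feature map: H=4, W=4, C=8 (illustrative sizes, not from the paper).
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((4, 4, 8))
tokens = feature_map.reshape(-1, 8)  # 16 spatial positions
w_q, w_k, w_v = (rng.standard_normal((8, 8)) * 0.1 for _ in range(3))

out, attn = self_attention(tokens, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Each row of `attn` sums to 1, so every output position is a weighted mixture of all positions in the feature map. This global mixing is one common way to let a detector relate small and large structures in the same image.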

This work is supported by the National Natural Science Foundation of China (61873259, U20A20200, 61821005) and the Youth Innovation Promotion Association of Chinese Academy of Sciences (2019203).

Author information

Authors and Affiliations

College of Information Engineering, Shenyang University of Technology, Shenyang, 110870, China
Shunfu Hua & Wei Li
State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China
Shunfu Hua, Huijie Fan, Naida Ding & Yandong Tang
Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, 110169, China
Shunfu Hua, Huijie Fan, Naida Ding & Yandong Tang

Corresponding author

Correspondence to Huijie Fan.

Editor information

Editors and Affiliations

Harbin Institute of Technology, Shenzhen, China
Honghai Liu
Huazhong University of Science and Technology, Wuhan, China
Zhouping Yin
Shenyang Institute of Automation, Shenyang, Liaoning, China
Lianqing Liu
Harbin Institute of Technology, Harbin, China
Li Jiang
Shanghai Jiao Tong University, Shanghai, China
Guoying Gu
Shenzhen Institutes of Advanced Technology, Shenzhen, China
Xinyu Wu
Harbin Institute of Technology, Shenzhen, China
Weihong Ren

About this paper

Cite this paper

Hua, S., Fan, H., Ding, N., Li, W., Tang, Y. (2022). A Self-attention Network for Face Detection Based on Unmanned Aerial Vehicles. In: Liu, H., et al. Intelligent Robotics and Applications. ICIRA 2022. Lecture Notes in Computer Science, vol 13456. Springer, Cham. https://doi.org/10.1007/978-3-031-13822-5_39

DOI: https://doi.org/10.1007/978-3-031-13822-5_39
Published: 04 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-13821-8
Online ISBN: 978-3-031-13822-5
eBook Packages: Computer Science, Computer Science (R0)
