Abstract
Face detection is a fundamental step in face analysis tasks. In recent years, deep learning-based face detection algorithms have developed rapidly. However, most neural networks are computationally expensive and rely on graphics processing units, which prevents them from being deployed in many practical applications. This paper explores the principles of designing tiny models and proposes an extremely tiny face detector based on the tiny-YOLOv3 framework, introducing network structures such as Cross-Stage-Partial connections (CSP), depthwise convolution, and Spatial Pyramid Pooling (SPP). The model has fewer than 10k parameters and requires less than 50 KB of storage when each parameter is stored in half-precision floating point (FP16). Furthermore, each layer's peak memory usage is under 0.07 MB, making the model deployable on a wide range of platforms. Experiments on a subset of the WIDER FACE dataset and Open Images Dataset V4 (OID) show that the proposed face detector achieves performance comparable to that of face detectors that are much larger in model size and floating-point operations.
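The size figures in the abstract can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the helper names and the channel counts (16 in, 32 out, 3x3 kernel) are hypothetical and are not taken from the paper's architecture. It shows that 10k FP16 parameters occupy at most 20,000 bytes, comfortably within the stated 50 KB bound, and why a depthwise convolution followed by a 1x1 pointwise convolution needs far fewer weights than a standard convolution of the same shape.

```python
# Back-of-the-envelope checks for the model-size claims in the abstract.
# All function names and layer shapes here are hypothetical illustrations,
# not the authors' implementation.

FP16_BYTES = 2  # a half-precision float occupies 2 bytes


def fp16_storage_bytes(num_params: int) -> int:
    """Bytes needed to store num_params weights in FP16."""
    return num_params * FP16_BYTES


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depthwise conv plus a 1x1 pointwise conv (bias ignored)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Upper bound from the abstract: fewer than 10k parameters.
    print(fp16_storage_bytes(10_000))              # 20000 bytes
    # Hypothetical layer: 16 -> 32 channels, 3x3 kernel.
    print(standard_conv_params(16, 32, 3))         # 4608
    print(depthwise_separable_params(16, 32, 3))   # 656
```

For this hypothetical layer the depthwise-separable form uses roughly one seventh of the weights, which is the kind of saving that makes a sub-10k-parameter budget plausible.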
Acknowledgement
FITC: This research was partially supported by National Basic Enhancement Research Program of China under key basic research project, National Natural Science Foundation (NSFC) of China under project No. 61906206, 62071478.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, C., Zhang, M., Peng, Y., Tan, H., Xiao, H. (2021). Extremely Tiny Face Detector for Platforms with Limited Resources. In: Peng, Y., Hu, SM., Gabbouj, M., Zhou, K., Elad, M., Xu, K. (eds) Image and Graphics. ICIG 2021. Lecture Notes in Computer Science, vol 12889. Springer, Cham. https://doi.org/10.1007/978-3-030-87358-5_29
DOI: https://doi.org/10.1007/978-3-030-87358-5_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87357-8
Online ISBN: 978-3-030-87358-5
eBook Packages: Computer Science (R0)