Learning Saliency Features for Face Detection and Recognition Using Multi-task Network

Zhao, Qian; Ge, Shuzhi Sam; Ye, Mao; Liu, Sibang; He, Wei

doi:10.1007/s12369-016-0347-x

Published: 22 March 2016

Volume 8, pages 709–720 (2016)

International Journal of Social Robotics

Qian Zhao¹,
Shuzhi Sam Ge²,³,
Mao Ye¹,
Sibang Liu⁴ &
Wei He⁵

Abstract

In this work, we propose a method for learning a type of saliency feature that responds only in face regions. Based on these saliency features, a joint pipeline is designed to detect and recognize faces as part of the human–robot interaction (HRI) system of the SRU robot. The characteristics of the architecture are as follows: (i) In the network, the detectors can be activated only by face regions. By convolving the input image, the detectors produce a group of saliency feature maps that indicate the locations of faces. (ii) The face representations are obtained by pooling over these high-response regions, and they are discriminative for face identification. Hence, classification and detection can be combined in a single network. (iii) To enhance the saliency of the features, false responses are suppressed by introducing a saliency term into the loss function, which forces the feature detectors to ignore non-face inputs. This term can also be seen as a branch of a multi-task network that learns the background. Restricting false responses improves face verification performance, especially when training and testing are carried out on different datasets. In the experiments, we analyze the effect of the saliency term on face verification and benchmark the discriminative ability of the saliency features on LFW. The effectiveness of the method for face detection is verified by experimental results on FDDB.
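
The three characteristics listed in the abstract can be illustrated with a small NumPy sketch. This is not the authors' network: the abstract gives neither the architecture nor the exact form of the loss. It assumes a single layer of hypothetical filters, a mean over feature maps as the saliency map, max pooling inside a thresholded box, and a mean squared activation on background inputs as the saliency term. All function names, the threshold, and the `weight` parameter are assumptions made for illustration only.

```python
import numpy as np


def feature_maps(image, filters):
    """Valid cross-correlation of a 2-D image with K filters, followed by ReLU.

    image: (H, W); filters: (K, h, w). Returns maps of shape (K, H-h+1, W-w+1).
    """
    windows = np.lib.stride_tricks.sliding_window_view(image, filters.shape[1:])
    # windows has shape (H-h+1, W-w+1, h, w); contract the patch axes per filter.
    responses = np.einsum("ijhw,khw->kij", windows, filters)
    return np.maximum(responses, 0.0)


def locate_face(maps, threshold):
    """Box (top, left, bottom, right) around above-threshold responses, or None.

    Coordinates are in feature-map space, not input-image space.
    """
    saliency = maps.mean(axis=0)  # assumed: average the K maps into one
    rows, cols = np.nonzero(saliency > threshold)
    if rows.size == 0:
        return None
    return (int(rows.min()), int(cols.min()),
            int(rows.max()) + 1, int(cols.max()) + 1)


def pooled_representation(maps, box):
    """Max-pool each feature map inside the box: one value per filter."""
    top, left, bottom, right = box
    return maps[:, top:bottom, left:right].max(axis=(1, 2))


def multitask_loss(face_logits, face_label, background_maps, weight=1.0):
    """Identification cross-entropy plus an assumed saliency penalty.

    The penalty is the mean squared activation on a non-face input, so it is
    zero exactly when the detectors stay silent on background.
    """
    shifted = face_logits - face_logits.max()  # stabilise the softmax
    log_probs = shifted - np.log(np.exp(shifted).sum())
    identification = -log_probs[face_label]
    saliency_term = np.mean(background_maps ** 2)
    return identification + weight * saliency_term
```

In this sketch, `weight` trades off identification accuracy against suppression of background responses; setting it to zero recovers a plain classification loss, which is the kind of comparison the abstract describes when it analyzes the effect of the saliency term.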

Acknowledgments

This work was supported by the National Basic Research Program of China (973 Program) under Grant 2014CB744206 and the Fundamental Research Funds for the China Central Universities of UESTC under Grant ZYGX2013Z003.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Center for Robotics, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
Qian Zhao & Mao Ye
Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
Shuzhi Sam Ge
University of Electronic Science and Technology of China, Chengdu, China
Shuzhi Sam Ge
School of Automation Engineering, Center for Robotics, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
Sibang Liu
School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
Wei He

Corresponding author

Correspondence to Qian Zhao.

About this article

Cite this article

Zhao, Q., Ge, S.S., Ye, M. et al. Learning Saliency Features for Face Detection and Recognition Using Multi-task Network. Int J of Soc Robotics 8, 709–720 (2016). https://doi.org/10.1007/s12369-016-0347-x

Accepted: 03 March 2016
Published: 22 March 2016
Issue Date: November 2016
DOI: https://doi.org/10.1007/s12369-016-0347-x
