Synergistic Face Detection and Pose Estimation with Energy-Based Models

Osadchy, Margarita; Le Cun, Yann; Miller, Matthew L.

doi:10.1007/11957959_10

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4170)

Abstract

We describe a novel method for real-time, simultaneous multi-view face detection and facial pose estimation. The method employs a convolutional network to map face images to points on a manifold, parametrized by pose, and non-face images to points far from that manifold. This network is trained by optimizing a loss function of three variables: image, pose, and face/non-face label. We test the resulting system, in a single configuration, on three standard data sets – one for frontal pose, one for rotated faces, and one for profiles – and find that its performance on each set is comparable to previous multi-view face detectors that can only handle one form of pose variation. We also show experimentally that the system’s accuracy on both face detection and pose estimation is improved by training for the two tasks together.
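
The detection idea in the abstract can be illustrated with a minimal sketch. It assumes the network output is a 3-D vector, that pose is a single yaw angle, and that the manifold is traced by three shifted cosines. These choices, along with the names `manifold_point`, `energy`, and `detect` and the threshold value, are hypothetical and are not taken from the abstract. The sketch treats the distance between a network output and the nearest manifold point as an energy: a low energy suggests a face, and the minimizing angle is the pose estimate.

```python
import math

# Hypothetical manifold parameterisation: a yaw angle (radians) is mapped
# to a 3-D point via shifted cosines. This is an illustrative assumption,
# not a detail stated in the abstract.
ALPHAS = (-math.pi / 3, 0.0, math.pi / 3)


def manifold_point(yaw):
    """Return the point on the pose manifold for a given yaw angle."""
    return [math.cos(yaw - a) for a in ALPHAS]


def energy(output, yaw):
    """Euclidean distance between a network output and the manifold point."""
    point = manifold_point(yaw)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(output, point)))


def detect(output, threshold=0.5, steps=181):
    """Grid-search the yaw of minimum energy, then threshold that energy.

    Returns (is_face, best_yaw, best_energy). The threshold is an
    arbitrary illustrative value.
    """
    # Candidate yaws from -90 to +90 degrees in equal steps.
    candidates = [-math.pi / 2 + math.pi * i / (steps - 1) for i in range(steps)]
    best_yaw = min(candidates, key=lambda yaw: energy(output, yaw))
    best_energy = energy(output, best_yaw)
    return best_energy < threshold, best_yaw, best_energy
```

In this sketch a vector lying on the manifold, such as `manifold_point(0.3)`, is accepted as a face with an estimated yaw near 0.3 radians, while a vector far from the manifold, such as `[5.0, 5.0, 5.0]`, is rejected. In a trained system the vector would come from the convolutional network rather than being constructed by hand.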

Author information

Authors and Affiliations

Computer Science Department, University of Haifa, Mount Carmel, Haifa, 31905, Israel
Margarita Osadchy
The Courant Institute of Mathematical Sciences, New York University, 715 Broadway, New York, NY, 10003, USA
Yann Le Cun
NEC Labs America, 4 Independence Way, Princeton, NJ, 08540, USA
Matthew L. Miller

Editor information

Editors and Affiliations

Département d’Informatique, Ecole Normale Supérieure, P.O. Box, Paris, France
Jean Ponce
Carnegie Mellon University, Pittsburgh, USA
Martial Hebert
GRAVIR-INRIA, 655 avenue de l’Europe, P.O. Box, 38330, Montbonnot, France
Cordelia Schmid
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this chapter

Cite this chapter

Osadchy, M., Le Cun, Y., Miller, M.L. (2006). Synergistic Face Detection and Pose Estimation with Energy-Based Models. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds) Toward Category-Level Object Recognition. Lecture Notes in Computer Science, vol 4170. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11957959_10

DOI: https://doi.org/10.1007/11957959_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68794-8
Online ISBN: 978-3-540-68795-5
eBook Packages: Computer Science (R0)
