Facial pose from 3D data

doi:10.1016/j.imavis.2006.02.010

Image and Vision Computing

Volume 24, Issue 8, 1 August 2006, Pages 849-856

https://doi.org/10.1016/j.imavis.2006.02.010

Abstract

The distribution of the apparent 3D shape of human faces across the view-sphere is complex, owing to factors such as variations in identity, facial expression, minor occlusions and noise. In this paper, we apply support vector regression to wavelet sub-bands to learn a model relating facial shape (obtained from 3D scanners) to 3D pose in an identity-invariant manner. The proposed method yields an estimation accuracy of 97–99% within an error of ±9° on a large dataset drawn from two different sources. The method could be used for pose estimation in a view-invariant face recognition system.

Introduction

Although 3D face recognition systems have not received very wide attention in the literature, new technologies are likely to make them more interesting and practical in the near future. Such systems have the potential of being more accurate than current 2D systems since 3D image capture eliminates the projection uncertainties associated with head pose in conventional 2D capture. Nevertheless, head pose would still remain an issue.

The apparent 3D shape of a human face changes considerably across different poses over the view-sphere, which leads to inaccurate matching in face recognition systems. It is known that an estimate of facial pose can improve face recognition accuracy considerably. In this paper, we outline a learning method to predict facial pose from a 3D scan using only shape information. The method is efficient and generic and uses the well-known technique of support vector regression. We also suggest a method using the discrete wavelet transform that enhances pose-specific details to improve the accuracy of pose estimates.

The estimate of facial pose obtained by our algorithm serves as a coarse prediction of the actual facial pose within a certain margin of error. This pose estimate has been used as an initial condition for an automated view-invariant 3D face recognition algorithm [20]. The aforementioned system incorporates the estimate of facial pose and uses it for iterative pose-refinement.

The paper is organized as follows. Section 2 gives an overview of existing methods for predicting facial pose from 2D and 3D images. Section 3 provides background material on support vector regression and outlines the method employed, including the training and testing procedure. Section 4 describes the experimental results. Section 5 gives conclusions and directions for future research, including a short description of how the proposed method is applied in a fully view-invariant 3D face recognition system.

Background

Several approaches have been proposed in the literature for face pose estimation from 2D images. These can be broadly categorized into feature- and appearance-based methods. The former [1] relate facial pose to spatial arrangements of significant facial features such as the eyes or nose. A common method is to calculate the facial pose from the angle subtended by the line joining the two eyes with respect to the chosen axes (for example [1]). In another approach, Choi et al. [24] use the EM

Theory: support vector regression

Support Vector Machines have emerged as a powerful classification and regression technique in computer vision [13], [14]. They are based on the principle of structural risk minimization [12]. Consider a set of l input patterns denoted as x, with their corresponding class-labels, denoted by the vector y. A support vector machine obtains a functional approximation given as f(x,α)=w·Φ(x)+b, where Φ is a mapping function from the original space of samples onto a higher-dimensional space, b is a
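The functional form f(x,α)=w·Φ(x)+b is usually evaluated through a kernel, since w can be written as a weighted sum of mapped training samples. The following is a minimal sketch of that evaluation together with the ε-insensitive loss that is standard in support vector regression. It is illustrative only: the Gaussian kernel, the `gamma` and `epsilon` values, the function names and the toy coefficients are all assumptions, not details taken from the paper.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian kernel K(x, z) = exp(-gamma * ||x - z||^2) (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, coeffs, bias, gamma=0.5):
    """Evaluate f(x) = sum_i coeffs[i] * K(sv_i, x) + bias.

    This is the kernel form of w . Phi(x) + b, with
    w = sum_i coeffs[i] * Phi(sv_i).
    """
    return bias + sum(
        c * rbf_kernel(sv, x, gamma) for sv, c in zip(support_vectors, coeffs)
    )


def epsilon_insensitive_loss(y_true, y_pred, epsilon=1.0):
    """Zero inside the epsilon tube, linear in the excess outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Toy values, chosen by hand purely for illustration.
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
coeffs = [2.0, -1.0]
bias = 0.5

estimate = svr_predict((0.0, 0.0), support_vectors, coeffs, bias)
print(estimate)
print(epsilon_insensitive_loss(10.0, 13.0, epsilon=1.0))
```

In a pose-estimation setting, `x` would be a feature vector derived from the facial scan and the output a pose angle; in practice the coefficients and bias come from solving the SVR training problem rather than being set by hand.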

Angle estimates

Initial experiments were performed with the angular sampling size δ to determine its optimum value. Fig. 3 illustrates the variation of regression accuracy with respect to δ. Clearly, the mean error in pose estimation bears a direct linear relationship to the angular sampling. A value of δ = 3° provides the best performance. Thus, in all the experiments reported below, angular sampling is set to 3° to obtain as accurate a pose model as possible, albeit at the cost of greater training
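The trade-off between a fine angular sampling and training cost can be illustrated with a short sketch. It assumes that training poses are generated on a regular grid of rotations about the Y- and X-axes with step δ. The angular ranges (±90° and ±45°) and all function names are hypothetical and not taken from the paper. The sketch also shows one plausible way to compute an accuracy figure "within ±9°", the tolerance quoted in the abstract.

```python
def pose_grid(max_angle_deg, delta_deg):
    """Angles from -max_angle to +max_angle in steps of delta, inclusive."""
    steps = int(max_angle_deg // delta_deg)
    return [k * delta_deg for k in range(-steps, steps + 1)]


def training_poses(max_y_deg, max_x_deg, delta_deg):
    """All (Y-rotation, X-rotation) pairs on the sampling grid."""
    return [
        (angle_y, angle_x)
        for angle_y in pose_grid(max_y_deg, delta_deg)
        for angle_x in pose_grid(max_x_deg, delta_deg)
    ]


def accuracy_within(true_angles, predicted_angles, tolerance_deg):
    """Fraction of estimates whose absolute error is within the tolerance."""
    hits = sum(
        1
        for t, p in zip(true_angles, predicted_angles)
        if abs(t - p) <= tolerance_deg
    )
    return hits / len(true_angles)


# Hypothetical ranges: +/-90 degrees about Y, +/-45 degrees about X.
for delta in (3, 9):
    print(delta, len(training_poses(90, 45, delta)))

# Made-up true and predicted angles, scored with a 9 degree tolerance.
print(accuracy_within([0, 10, 20, 30], [2, 25, 21, 30], tolerance_deg=9))
```

Under these assumed ranges, reducing δ from 9° to 3° increases the number of training poses per face roughly eightfold, which is consistent with the higher training cost noted above.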

Conclusion

We have proposed a method for generic estimation of 3D facial pose in terms of the angles of rotation about the Y-axis and X-axis. The results on a large dataset drawn from two different sources are consistent. The main advantage of the method is that the pose of any 3D facial scan can be predicted to a good degree of accuracy using just a small number of training faces.

Such a module was used in the creation of an automated 3D face recognition system that can normalize a facial scan in

Acknowledgements

Martin Levine would like to thank the Natural Sciences and Engineering Research Council of Canada for its financial support. The authors would also like to thank the University of Freiburg [15] and Notre Dame University [18] for providing us with 3D face databases.

References (25)

N. Sarris et al., Building three-dimensional head models, Graphical Models (2001)
Y. Li et al., Support vector machine based multi-view face detection and recognition, Image and Vision Computing (2004)
K. Hattori, S. Matsumori, Y. Sato, Estimating pose of human face based on symmetry plane using range and intensity...
A. Pentland, B. Moghaddam, T. Starner, View-based and modular eigenspaces for face recognition, Proceedings of the IEEE...
S. Srinivasan, K. Boyer, Head-pose estimation using view-based eigenspaces, Proceedings of the 16th International...
M. Motwani, Q. Ji, 3D face pose discrimination using wavelets, Proceedings of the International Conference on Image...
Y. Wei, L. Fradet, T. Tan, Head pose estimation using Gabor-eigenspace modeling, Proceedings of the International...
Y. Li, S. Gong, H. Liddell, Support vector regression and classification based multi-view face detection and...
J. Huang, X. Shao, H. Wechsler, Face pose discrimination using support vector machines, Proceedings of the 14th...
S. Li et al., Kernel based machine learning for multi-view face detection and pose estimation, Proceedings of the International Conference on Computer Vision (2001)
S. Malassiotis, M. Strintzis, Real-time head tracking and 3D pose estimation from range data, Proceedings of the...
S. Mallat, Theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence (1989)


¹: School of Computer Science.

²: Department of Electrical and Computer Engineering.
