Facial pose from 3D data
Introduction
Although 3D face recognition systems have not received very wide attention in the literature, new technologies are likely to make them more interesting and practical in the near future. Such systems have the potential of being more accurate than current 2D systems since 3D image capture eliminates the projection uncertainties associated with head pose in conventional 2D capture. Nevertheless, head pose would still remain an issue.
The apparent 3D shape of a human face changes considerably across poses over the view-sphere, which leads to inaccurate matching in face recognition systems. It is known that an estimate of facial pose can improve face recognition accuracy considerably. In this paper, we outline a learning method to predict facial pose from a 3D scan using only shape information. The method is efficient and generic, and it uses the well-known technique of support vector regression. We also suggest a method based on the discrete wavelet transform that enhances pose-specific details to improve the accuracy of the pose estimates.
The estimate of facial pose obtained by our algorithm serves as a coarse prediction of the actual facial pose within a certain margin of error. This pose estimate has been used as an initial condition for an automated view-invariant 3D face recognition algorithm [20]. The aforementioned system incorporates the estimate of facial pose and uses it for iterative pose-refinement.
The paper is organized as follows. Section 2 gives an overview of existing methods for predicting facial pose from 2D and 3D images. Section 3 provides background material on support vector regression and outlines the method employed, including the training and testing procedure. Section 4 describes the experimental results. Section 5 gives conclusions and directions for future research, and includes a short description of an application of the proposed method in a fully view-invariant 3D face recognition system.
Background
Several approaches have been proposed in the literature for face pose estimation from 2D images. These can be broadly categorized into feature- and appearance-based methods. The former [1] relate facial pose to spatial arrangements of significant facial features such as the eyes or nose. A common method is to calculate the facial pose from the angle subtended by the line joining the two eyes with respect to the chosen axes (for example [1]). In another approach, Choi et al. [24] use the EM
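The feature-based idea mentioned above, reading pose from the line joining the two eyes, can be illustrated with a minimal sketch. It assumes the eye centres are already available as 2D image coordinates; the function name and the sample coordinates are hypothetical and are not taken from the cited work.

```python
import math

def eye_line_angle_deg(left_eye, right_eye):
    """Angle in degrees between the eye-to-eye line and the image x-axis.

    left_eye and right_eye are (x, y) pairs (hypothetical inputs).
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    # atan2 handles all quadrants and a vertical eye line (dx == 0)
    return math.degrees(math.atan2(dy, dx))

# Illustrative coordinates only
level = eye_line_angle_deg((100.0, 120.0), (160.0, 120.0))   # eyes at equal height
tilted = eye_line_angle_deg((100.0, 120.0), (160.0, 180.0))  # equal x and y offsets
```

This captures only the in-plane angle of the eye line; it depends entirely on how reliably the eye positions are located.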
Theory: support vector regression
Support Vector Machines have emerged as a powerful classification and regression technique in computer vision [13], [14]. They are based on the principle of structural risk minimization [12]. Consider a set of l input patterns denoted as x, with their corresponding class-labels, denoted by the vector y. A support vector machine obtains a functional approximation given as f(x,α)=w·Φ(x)+b, where Φ is a mapping function from the original space of samples onto a higher-dimensional space, b is a
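As a rough illustration of the functional form f(x,α)=w·Φ(x)+b, the sketch below evaluates a trained regressor through the standard kernel expansion, in which w·Φ(x) is written as a weighted sum of kernel values K(x_i, x)=Φ(x_i)·Φ(x) over the support vectors. It also shows the ε-insensitive loss commonly used in support vector regression. The RBF kernel, the coefficients, and all numeric values are assumptions chosen for illustration; the excerpt does not state which kernel or parameters the authors used.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """K(u, v) = exp(-gamma * ||u - v||^2); gamma is an assumed value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, b, kernel=rbf_kernel):
    """Evaluate f(x) = sum_i c_i * K(x_i, x) + b over the support vectors."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + b

def eps_insensitive_loss(y_true, y_pred, eps=1.0):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

# Hypothetical trained model: two support vectors with signed coefficients
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
dual_coefs = [10.0, -10.0]
bias = 5.0

pred = svr_predict((0.0, 0.0), support_vectors, dual_coefs, bias)
inside_tube = eps_insensitive_loss(30.0, 30.5, eps=1.0)
outside_tube = eps_insensitive_loss(30.0, 33.0, eps=1.0)
```

With a loss of this shape, training errors smaller than ε are not penalized, which is what allows a comparatively small set of support vectors to define the regressor.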
Angle estimates
Initial experiments were performed to determine the optimum value of the angular sampling size δ. Fig. 3 illustrates the variation of regression accuracy with respect to δ. The mean error in pose estimation increases linearly with the angular sampling size, and a value of δ=3° provides the best performance. Thus, in all the experiments reported below, angular sampling is set to 3° to obtain as accurate a pose model as possible, albeit at the cost of greater training
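To make the role of the angular sampling size δ concrete, the following sketch enumerates training poses at a fixed step and rotates a point set about the Y-axis for each one. It assumes that training views are produced by rotating a frontal scan in steps of δ, which the excerpt does not confirm; the ±90° range, the function names, and the toy point are hypothetical.

```python
import math

def sampled_angles(min_deg, max_deg, delta_deg):
    """Angles from min_deg to max_deg in steps of delta_deg.

    Assumes the range is a whole multiple of delta_deg.
    """
    n_steps = int(round((max_deg - min_deg) / delta_deg))
    return [min_deg + i * delta_deg for i in range(n_steps + 1)]

def rotate_about_y(points, angle_deg):
    """Rotate (x, y, z) points about the Y-axis by angle_deg."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def training_views(points, min_deg, max_deg, delta_deg):
    """Pair each sampled angle (the regression target) with the rotated points."""
    return [(a, rotate_about_y(points, a))
            for a in sampled_angles(min_deg, max_deg, delta_deg)]

# Toy example: a single point on the Z-axis, sampled every 3 degrees
angles = sampled_angles(-90, 90, 3)
views = training_views([(0.0, 0.0, 1.0)], -90, 90, 3)
quarter_turn = rotate_about_y([(0.0, 0.0, 1.0)], 90)[0]
```

Halving δ roughly doubles the number of training views per face, which is the training cost that a finer sampling trades for accuracy.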
Conclusion
We have proposed a method for generic estimation of 3D facial pose in terms of the angles of rotation about the Y-axis and X-axis. The results on a large dataset drawn from two different sources are consistent. The main advantage of the method is that the pose of any 3D facial scan can be predicted to a good degree of accuracy using just a small number of training faces.
Such a module was used in the creation of an automated 3D face recognition system that can normalize a facial scan in
Acknowledgements
Martin Levine would like to thank the Natural Sciences and Engineering Research Council of Canada for its financial support. The authors would also like to thank the University of Freiburg [15] and Notre Dame University [18] for providing us with 3D face databases.
References (25)
- et al., Building three-dimensional head models, Graphical Models (2001)
- et al., Support vector machine based multi-view face detection and recognition, Image and Vision Computing (2004)
- K. Hattori, S. Matsumori, Y. Sato, Estimating pose of human face based on symmetry plane using range and intensity...
- A. Pentland, B. Moghaddam, T. Starner, View-based and modular eigenspaces for face recognition, Proceedings of the IEEE...
- S. Srinivasan, K. Boyer, Head-pose estimation using view-based eigenspaces, Proceedings of the 16th International...
- M. Motwani, Q. Ji, 3D face pose discrimination using wavelets, Proceedings of the International Conference on Image...
- Y. Wei, L. Fradet, T. Tan, Head pose estimation using Gabor-eigenspace modeling, Proceedings of the International...
- Y. Li, S. Gong, H. Liddell, Support vector regression and classification based multi-view face detection and...
- J. Huang, X. Shao, H. Wechsler, Face pose discrimination using support vector machines, Proceedings of the 14th...
- et al., Kernel based machine learning for multi-view face detection and pose estimation, Proceedings of the International Conference on Computer Vision (2001)
- Theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by (12)
Coarse head pose estimation of construction equipment operators to formulate dynamic blind spots
2012, Advanced Engineering Informatics
Citation Excerpt: Li et al. [36] uses PCA on intensity images to perform dimensional reduction and then regression. A similar approach has been followed by Rajwade and Levine [35], but they applied 2D techniques on range data. Hence [35,36] both choose the significant principal components for regression.
A training-free nose tip detection method from face range images
2011, Pattern Recognition
Citation Excerpt: These methods are largely not pose invariant, relatively slow, and sometimes need to estimate the second order derivatives of a range image when computing the H and K curvatures and the principal curvatures, which is error-prone because of the noise in the image. Secondly, many existing methods are training based [7–11,22,24,28] or model based [3,20,21,23,27]: Training is laborious, and only works well for limited poses, as has been reported in [30]. Furthermore, many training-based methods work well on frontal poses and are not suitable for profile views.
3D face verification across pose based on euler rotation and tensors
2018, Multimedia Tools and Applications
Driver Head Pose Estimation by Regression
2016, Lecture Notes in Mobility
An improved approach for depth data based face pose estimation using particle swarm optimization
2014, VISAPP 2014 - Proceedings of the 9th International Conference on Computer Vision Theory and Applications
Multi-camera head pose estimation
2012, Machine Vision and Applications