Abstract
Estimating a person’s head pose is an important problem for which many solutions have been proposed. Most existing solutions use a single camera and assume that the head is confined to a relatively small region of space. Estimating head pose unobtrusively in a large environment, however, requires several cameras to cover the monitored area. In this work, we propose a novel solution to the multi-camera head pose estimation problem that exploits the additional information that multi-camera configurations provide. Our approach uses the probability estimates produced by multi-class support vector machines to calculate the probability distribution of the head pose. The distributions produced by the individual cameras are then fused, resulting in a more precise estimate than any single camera provides. We report experimental results confirming that the fused distribution is more accurate than the individual classifiers and is highly robust to errors.
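The abstract does not state the fusion rule, so the following is only a minimal sketch of one common option, a normalized product of the per-camera distributions. It assumes (these assumptions are not taken from the paper) that every camera's distribution has already been expressed over the same ordered set of pose classes in a shared reference frame, that the cameras are conditionally independent given the pose, and that the class prior is uniform. The function name `fuse_distributions`, the `eps` floor, and the sample values are hypothetical.

```python
from math import prod


def fuse_distributions(per_camera_probs, eps=1e-9):
    """Fuse per-camera class distributions with a normalized product rule.

    per_camera_probs: list of lists; each inner list holds one camera's
    probability estimate over the same ordered set of pose classes.
    Assumption (not from the paper): cameras are conditionally independent
    given the pose, and the class prior is uniform.
    """
    n_classes = len(per_camera_probs[0])
    if any(len(p) != n_classes for p in per_camera_probs):
        raise ValueError("all cameras must report the same number of classes")

    # Floor each probability at eps so that a single camera reporting zero
    # for a class cannot eliminate that class on its own.
    scores = [
        prod(max(p[k], eps) for p in per_camera_probs)
        for k in range(n_classes)
    ]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical outputs of two per-camera classifiers over three pose classes.
cam_a = [0.6, 0.3, 0.1]
cam_b = [0.5, 0.4, 0.1]
fused = fuse_distributions([cam_a, cam_b])
print(fused)
```

In this example both cameras favor the first class, so the fused distribution is more peaked on that class than either input. When cameras disagree, the product rule lowers the confidence of the fused estimate instead.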
Muñoz-Salinas, R., Yeguas-Bolivar, E., Saffiotti, A. et al. Multi-camera head pose estimation. Machine Vision and Applications 23, 479–490 (2012). https://doi.org/10.1007/s00138-012-0410-z
DOI: https://doi.org/10.1007/s00138-012-0410-z