Abstract
This paper presents a novel approach to estimating the head pose and 3D face orientation of several people in low-resolution sequences from multiple calibrated cameras. Spatial redundancy is exploited, and each head in the scene is approximated by an ellipsoid. Skin patches from each detected head are located in each camera view. Data fusion is performed by back-projecting the skin patches from the single images onto the estimated 3D head model, thus providing a synthetic reconstruction of the head appearance. A particle filter is employed to estimate the head pan angle of the person under study, and a likelihood function based on the face appearance is introduced. Experimental results demonstrating the effectiveness of the proposed algorithm are provided for the SmartRoom scenario of the CLEAR Evaluation 2007 Head Orientation dataset.
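As a rough illustration of the particle filtering step mentioned in the abstract, the following minimal sketch tracks a single pan angle with a bootstrap particle filter. It is not the authors' implementation: the function names, the random-walk motion model, the multinomial resampling, and the parameters `sigma` and `kappa` are all assumptions made for this example. In particular, `toy_likelihood` is a von Mises-shaped placeholder that compares each particle with a directly observed angle, whereas the paper's likelihood is built from the face appearance reconstructed on the ellipsoid.

```python
import numpy as np


def wrap_angle(theta):
    """Wrap angles (radians) to the interval [-pi, pi)."""
    return (theta + np.pi) % (2.0 * np.pi) - np.pi


def circular_mean(angles, weights):
    """Weighted mean direction of a set of angles (radians)."""
    return np.arctan2(np.sum(weights * np.sin(angles)),
                      np.sum(weights * np.cos(angles)))


def toy_likelihood(particles, observed_pan, kappa=4.0):
    """Placeholder likelihood (assumption): peaks where a particle matches
    the observed pan angle. The paper uses an appearance-based likelihood."""
    return np.exp(kappa * (np.cos(particles - observed_pan) - 1.0))


def particle_filter_step(particles, observed_pan, rng, sigma=0.2):
    """One predict / weight / estimate / resample cycle."""
    # Predict: random-walk motion model on the circle (assumption).
    noise = rng.normal(0.0, sigma, size=particles.shape)
    particles = wrap_angle(particles + noise)
    # Weight: evaluate the likelihood and normalise.
    weights = toy_likelihood(particles, observed_pan)
    weights = weights / np.sum(weights)
    # Estimate: weighted circular mean of the particle set.
    estimate = circular_mean(particles, weights)
    # Resample: multinomial resampling proportional to the weights.
    idx = rng.choice(particles.size, size=particles.size, p=weights)
    return particles[idx], estimate


def track_pan(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of observed pan angles (radians)."""
    rng = np.random.default_rng(seed)
    # Start with no prior knowledge: particles uniform over the circle.
    particles = rng.uniform(-np.pi, np.pi, size=n_particles)
    estimates = []
    for z in observations:
        particles, est = particle_filter_step(particles, z, rng)
        estimates.append(est)
    return np.array(estimates)
```

Calling `track_pan(np.full(20, 1.0))` returns one pan estimate per frame. The circular mean is used instead of an arithmetic mean so that particles on either side of the ±π boundary do not average to a direction near zero.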
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Canton-Ferrer, C., Casas, J.R., Pardàs, M. (2008). Head Orientation Estimation Using Particle Filtering in Multiview Scenarios. In: Stiefelhagen, R., Bowers, R., Fiscus, J. (eds) Multimodal Technologies for Perception of Humans. CLEAR 2007, RT 2007. Lecture Notes in Computer Science, vol 4625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68585-2_30
DOI: https://doi.org/10.1007/978-3-540-68585-2_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68584-5
Online ISBN: 978-3-540-68585-2
eBook Packages: Computer Science (R0)