Abstract
We present an action recognition scheme that integrates cues from multiple modalities, namely shape, motion, and depth, to recognize human gestures in video sequences. The proposed approach extends a classification framework commonly used in 2D object recognition to 3D spatio-temporal space for recognizing actions. Specifically, a boosting-based classifier learns spatio-temporal features specific to the target actions, where the features are obtained from temporal patterns of shape contour, optical flow, and depth changes occurring at local body parts. The individual features differ in strength and sensitivity depending on many factors, including the action, the underlying body parts, and the background. In the current method, the cues of different modalities are combined optimally by a Fisher linear discriminant to form a strong feature that preserves the strengths of the individual cues. In the experiments, we apply the integrated action classifier to a set of target actions, evaluate its performance by comparison with single-cue classifiers, and present a qualitative analysis of the performance gain.
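The cue-combination step can be illustrated with a minimal sketch of a two-class Fisher linear discriminant. This is not the authors' implementation: it assumes each cue (shape, motion, depth) has already been reduced to one scalar response per sample, and the function names `fisher_weights`, `fisher_score`, and `combine_cues` are hypothetical, as is the small regularization term.

```python
import numpy as np


def fisher_weights(pos, neg, reg=1e-6):
    """Return a unit-length Fisher discriminant direction.

    pos, neg: arrays of shape (n_samples, n_cues) holding per-cue
    responses for positive (target action) and negative samples.
    reg: small ridge term (an assumption) that keeps the scatter
    matrix invertible when cues are strongly correlated.
    """
    mu_p, mu_n = pos.mean(axis=0), neg.mean(axis=0)
    cp, cn = pos - mu_p, neg - mu_n
    # Within-class scatter: sum of the two per-class scatter matrices.
    sw = cp.T @ cp + cn.T @ cn
    sw += reg * np.eye(sw.shape[0])
    # The direction maximizing between-class over within-class
    # variance is Sw^{-1} (mu_p - mu_n).
    w = np.linalg.solve(sw, mu_p - mu_n)
    return w / np.linalg.norm(w)


def fisher_score(w, pos, neg):
    """Fisher criterion: squared projected mean gap over projected scatter."""
    gap = (pos.mean(axis=0) - neg.mean(axis=0)) @ w
    cp, cn = pos - pos.mean(axis=0), neg - neg.mean(axis=0)
    scatter = w @ (cp.T @ cp + cn.T @ cn) @ w
    return gap ** 2 / scatter


def combine_cues(responses, w):
    """Project per-cue responses onto w to get one combined feature."""
    return responses @ w


if __name__ == "__main__":
    # Synthetic responses for three cues of unequal strength
    # (illustrative data only, not from the paper).
    rng = np.random.default_rng(0)
    pos = rng.normal([1.0, 0.5, 0.2], 1.0, size=(200, 3))
    neg = rng.normal([0.0, 0.0, 0.0], 1.0, size=(200, 3))
    w = fisher_weights(pos, neg)
    print("weights:", w)
    print("combined score:", fisher_score(w, pos, neg))
    for i, name in enumerate(["shape", "motion", "depth"]):
        print(name, "alone:", fisher_score(np.eye(3)[i], pos, neg))
```

In a boosting setting, the combined scalar returned by `combine_cues` could then feed a weak classifier such as a threshold; that pairing is likewise an assumption here rather than a detail stated in the abstract.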
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jung, SH., Guo, Y., Sawhney, H., Kumar, R. (2007). Multiple Cue Integrated Action Detection. In: Lew, M., Sebe, N., Huang, T.S., Bakker, E.M. (eds) Human–Computer Interaction. HCI 2007. Lecture Notes in Computer Science, vol 4796. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75773-3_12
DOI: https://doi.org/10.1007/978-3-540-75773-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75772-6
Online ISBN: 978-3-540-75773-3
eBook Packages: Computer Science (R0)