Abstract
This paper presents a novel stereo camera-based approach to Human Activity Recognition (HAR) that uses robust 3-D human body joint features and joint-specific Hidden Markov Models (HMMs). First, body joint angles are estimated by co-registering a 3-D body model to the stereo video information (i.e., 3-D depth) of a human posture acquired by a stereo camera. Conventionally, all joint angles are augmented, discriminant features are extracted from them, and one HMM is modeled for each activity. Although this traditional approach is straightforward and easy to implement, it depends on unnecessary joint features that are not even used in the activity. In this study, we focus on individual 3-D body joints rather than all joints together, and we consider each joint's motion information in the next frame in addition to its degree-of-freedom values (i.e., joint angles in the current frame). We propose a new way of modeling human activities and derive joint-specific HMMs. The different activity classes are first determined from the motion information of the joints in the next frame and the degree-of-freedom information of the body joints in the time-sequential distinguished activity video frames. The features of each joint are then mapped into codewords to generate a sequence of discrete symbols for the joint-specific HMM. The joint-specific HMMs are then trained according to their use in different activities. For testing, after the activity class is determined from the time-sequential body joint features, the discrete symbol sequence from each joint is applied to the trained joint-specific HMMs of the activities from that class only. Thus, the likelihoods of all activities are obtained for every body joint, and the likelihoods for corresponding activities are summed. Finally, the activity with the highest summed likelihood is chosen.
Using joint-specific HMMs (i.e., multiple HMMs per activity, based on the active body joints) yields better recognition performance than both the single HMM per activity based on augmented joint angle features and the traditional silhouette-based approaches.
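The testing stage described in the abstract (codeword mapping, per-joint HMM scoring, summation, and selection) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`quantize`, `log_likelihood`, `recognize`), the nearest-codeword rule, the toy codebook, and all model parameters are assumptions made for illustration. The sketch also sums log-likelihoods for numerical stability, whereas the abstract only says that likelihoods are summed, and the `candidates` argument stands in for the activities of the class determined beforehand.

```python
import math

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codeword
    (squared Euclidean distance; the distance measure is an assumption)."""
    symbols = []
    for f in features:
        dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in codebook]
        symbols.append(dists.index(min(dists)))
    return symbols

def log_likelihood(symbols, pi, A, B):
    """Scaled forward algorithm for a discrete HMM.
    pi: initial state probabilities, A: state transition matrix,
    B: per-state symbol emission matrix. Returns log P(symbols | model)."""
    n = len(pi)
    alpha = [pi[i] * B[i][symbols[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(symbols)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][symbols[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p

def recognize(joint_symbols, joint_hmms, candidates):
    """Sum per-joint scores for each candidate activity and pick the best.
    joint_symbols: {joint: [symbol, ...]}
    joint_hmms:    {activity: {joint: (pi, A, B)}}
    candidates:    activities belonging to the pre-determined class."""
    scores = {}
    for activity in candidates:
        scores[activity] = sum(
            log_likelihood(joint_symbols[joint], pi, A, B)
            for joint, (pi, A, B) in joint_hmms[activity].items()
        )
    return max(scores, key=scores.get), scores

# Toy example: one hypothetical joint, two hypothetical activities.
codebook = [[0.0], [1.0]]
elbow_features = [[0.1], [0.9], [0.2], [0.8]]
joint_symbols = {"elbow": quantize(elbow_features, codebook)}
joint_hmms = {
    "rest": {"elbow": ([1.0, 0.0],
                       [[1.0, 0.0], [0.0, 1.0]],
                       [[0.9, 0.1], [0.9, 0.1]])},
    "wave": {"elbow": ([1.0, 0.0],
                       [[0.5, 0.5], [0.5, 0.5]],
                       [[0.9, 0.1], [0.1, 0.9]])},
}
best, scores = recognize(joint_symbols, joint_hmms, ["rest", "wave"])
print(best, scores)
```

In this sketch every candidate activity is scored over the same set of joints. If different activities used different joint subsets, the summed scores would need some form of normalization before comparison, which the abstract does not specify.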
Acknowledgments
This paper was supported by the Faculty Research Fund, Sungkyunkwan University, 2013.
About this article
Cite this article
Uddin, M.Z., Kim, TS. 3-D body joint-specific HMM-based approach for human activity recognition from stereo posture image sequence. Multimed Tools Appl 74, 11207–11222 (2015). https://doi.org/10.1007/s11042-014-2225-6
DOI: https://doi.org/10.1007/s11042-014-2225-6