Abstract
The human visual system receives RGB and depth information simultaneously and can make accurate judgments about human behaviors. An ordinary camera, however, loses information when a 3D scene is projected onto a 2D plane. The depth and RGB information collected simultaneously by Kinect therefore provides more discriminative information about human behaviors than traditional cameras, and RGB-D cameras have long been regarded as a key to solving human behavior recognition. In this paper, we develop a 3D motion scale-invariant feature transform to describe depth and motion information; it serves as a more effective descriptor for RGB and depth videos. A hidden Markov model is then used to improve the accuracy of human behavior recognition. Experiments show that our framework provides richer discriminative information for behavior analysis and achieves better recognition performance.
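The classification stage described above scores a feature sequence under one hidden Markov model per behavior class and picks the highest-likelihood class. A minimal sketch of that idea, assuming descriptors have already been quantized into discrete codeword indices (the toy class names, model parameters, and observation sequence below are hypothetical, not the paper's trained models):

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm and per-step normalization."""
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict, then weight by emission
        s = alpha.sum()
        log_lik += np.log(s)
        alpha /= s
    return log_lik

# Two toy 2-state behavior-class HMMs over a 3-symbol codebook
# (illustrative parameters only).
pi = np.array([0.6, 0.4])
A_wave  = np.array([[0.9, 0.1], [0.2, 0.8]])
A_punch = np.array([[0.5, 0.5], [0.5, 0.5]])
B_wave  = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]])
B_punch = np.array([[0.1, 0.1, 0.8], [0.3, 0.3, 0.4]])

obs = [0, 0, 1, 0, 1]  # quantized feature codewords for one video clip
scores = {
    "wave":  forward_log_likelihood(obs, pi, A_wave,  B_wave),
    "punch": forward_log_likelihood(obs, pi, A_punch, B_punch),
}
label = max(scores, key=scores.get)
print(label)  # the class whose HMM best explains the sequence
```

In practice each class model would be trained (e.g., with Baum-Welch) on labeled sequences, and continuous descriptors could instead use Gaussian emissions.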
Acknowledgments
The work presented in this paper was supported by the National Natural Science Foundation of China (Grant Nos. NSFC-61170176 and NSFC-61402046), the Fund for the Doctoral Program of Higher Education of China (Grant No. 20120005110002), and the President Funding of Beijing University of Posts and Telecommunications.
Cite this article
Wu, Y., Jia, Z., Ming, Y. et al. Human behavior recognition based on 3D features and hidden markov models. SIViP 10, 495–502 (2016). https://doi.org/10.1007/s11760-015-0756-6