
Human behavior recognition based on 3D features and hidden Markov models

  • Original Paper
  • Published in: Signal, Image and Video Processing


Abstract

The human visual system receives RGB and depth information simultaneously and can thus judge human behaviors accurately. An ordinary camera, however, loses information when a 3D scene is projected onto a 2D plane. The depth and RGB information collected simultaneously by a Kinect sensor provide more discriminative information about human behaviors than a traditional camera does, which is why RGB-D cameras have long been regarded as a key to solving human behavior recognition. In this paper, we develop a 3D motion scale-invariant feature transform for describing depth and motion information; it serves as a more effective descriptor for RGB and depth videos. A hidden Markov model is then used to improve the accuracy of human behavior recognition. Experiments show that our framework provides richer discriminative information for behavior analysis and achieves better recognition performance.
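As a rough illustration of the HMM classification step described above, the sketch below scores a sequence of quantized per-frame feature codewords against per-behavior hidden Markov models using the scaled forward algorithm, then picks the behavior whose model assigns the sequence the highest likelihood. The behavior names, model parameters, and two-symbol codeword alphabet are toy assumptions for illustration, not the paper's trained models.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM).

    obs   -- list of observation symbols (codeword indices)
    start -- initial state distribution
    trans -- trans[j][i]: probability of moving from state j to state i
    emit  -- emit[i][o]: probability state i emits symbol o
    """
    n_states = len(start)
    # Initialize with the first observation, then rescale each step
    # so long sequences do not underflow.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    s = sum(alpha)
    log_lik = math.log(s)
    alpha = [a / s for a in alpha]
    for t in range(1, len(obs)):
        alpha = [
            sum(alpha[j] * trans[j][i] for j in range(n_states)) * emit[i][obs[t]]
            for i in range(n_states)
        ]
        s = sum(alpha)
        log_lik += math.log(s)
        alpha = [a / s for a in alpha]
    return log_lik

def classify(obs, models):
    """Pick the behavior whose HMM gives the sequence the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Hypothetical per-behavior HMMs over a 2-symbol codeword alphabet:
# (start distribution, transition matrix, emission matrix).
models = {
    "wave": ([0.6, 0.4],
             [[0.7, 0.3], [0.3, 0.7]],
             [[0.9, 0.1], [0.6, 0.4]]),   # mostly emits codeword 0
    "walk": ([0.5, 0.5],
             [[0.6, 0.4], [0.4, 0.6]],
             [[0.2, 0.8], [0.1, 0.9]]),   # mostly emits codeword 1
}

sequence = [0, 0, 1, 0, 0]   # quantized per-frame feature codewords
print(classify(sequence, models))   # → wave
```

In the paper's pipeline, the codewords would come from quantized 3D motion SIFT descriptors extracted from the RGB-D video rather than the hand-picked symbols used here.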




Acknowledgments

The work presented in this paper was supported by the National Natural Science Foundation of China (Grants Nos. NSFC-61170176, NSFC-61402046), Fund for the Doctoral Program of Higher Education of China (Grants No. 20120005110002), President Funding of Beijing University of Posts and Telecommunications.

Author information

Corresponding author

Correspondence to Liujuan Cao.


About this article


Cite this article

Wu, Y., Jia, Z., Ming, Y. et al. Human behavior recognition based on 3D features and hidden Markov models. SIViP 10, 495–502 (2016). https://doi.org/10.1007/s11760-015-0756-6

