
Deep Neural Networks for Human Behavior Understanding

Chapter in: Handbook of Multimedia Information Security: Techniques and Applications

Abstract

Human behavior understanding techniques have been proposed for several applications, such as object recognition, face detection, emotion detection, action detection, fingerprint identification, gait recognition, and voice recognition. Emotion and action recognition are the most popular among these applications. This chapter presents an analysis of recently developed deep learning techniques for emotion and activity recognition, discussing existing approaches that use deep learning as their core component. Experimental results are reported on benchmark datasets: the CK+ and SFEW datasets for emotion recognition, and the Skoda and UCF101 datasets for activity recognition. The experiments show that deep learning methods outperform the other existing techniques in the literature.
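To make the kind of model surveyed in the chapter concrete, the sketch below shows a minimal convolutional network for facial expression classification in PyTorch. This is a generic illustration, not the chapter's architecture: the 48x48 grayscale face crops, the layer widths, and the seven emotion classes (as in CK+) are assumptions made for the example.

```python
# Minimal CNN sketch for facial expression recognition (illustrative only).
# Input size, channel counts, and the 7-class output are assumptions.
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 48x48 -> 24x24
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                              # 24x24 -> 12x12
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 12 * 12, 128), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, num_classes),                  # logits over emotion classes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Example forward pass on a batch of 8 grayscale face crops.
model = EmotionCNN()
logits = model(torch.randn(8, 1, 48, 48))
print(logits.shape)  # torch.Size([8, 7])
```

In practice such a network would be trained with a cross-entropy loss on labeled face crops; the deeper architectures and training schemes compared in the chapter follow the same basic pattern at larger scale.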

Acknowledgments

This study was sponsored by the Science and Engineering Research Board, Department of Science and Technology, Government of India, via grant no. PDF/2016/003644.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Singh, R., Nigam, S. (2019). Deep Neural Networks for Human Behavior Understanding. In: Singh, A., Mohan, A. (eds) Handbook of Multimedia Information Security: Techniques and Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-15887-3_32

  • DOI: https://doi.org/10.1007/978-3-030-15887-3_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-15886-6

  • Online ISBN: 978-3-030-15887-3
