End to End Deep Learning for Single Step Real-Time Facial Expression Recognition

  • Conference paper
Video Analytics. Face and Facial Expression Recognition and Audience Measurement (VAAM 2016, FFER 2016)

Abstract

In recent years, extensive research has been carried out on face detection and facial expression recognition, yet very few methods achieve both in real time with high accuracy. In this paper we present a real-time, end-to-end, single-step face detection and facial expression recognition technique that runs at more than 10 fps (frames per second). We use an end-to-end deep learning approach for face localization and expression classification. On the CK+ dataset [1] we obtain a 10-fold cross-validation accuracy of 94.8% on 640 × 480 images. We have also built a webcam interface that classifies a person's emotion at 10 fps, supporting our claim that facial expression recognition has reached real-time speed with very decent accuracy.
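The throughput claim can be illustrated with a simple timing loop. The sketch below is not the authors' implementation: `detect_and_classify` is a hypothetical stand-in for the single-step network (one forward pass returning a face box and an expression label together), and frames are simulated rather than read from a webcam.

```python
import time

def detect_and_classify(frame):
    """Placeholder for the single-step network: a single forward pass
    yields both a face bounding box and an expression label."""
    time.sleep(0.05)  # simulate ~50 ms of inference (~20 fps upper bound)
    return (100, 80, 220, 200), "happy"

def measure_fps(frames):
    """Run the pipeline over a sequence of frames and return frames/second."""
    start = time.perf_counter()
    for frame in frames:
        detect_and_classify(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

if __name__ == "__main__":
    # In practice the 640x480 frames would come from a webcam
    # (e.g. OpenCV's VideoCapture); None stands in for a frame here.
    fps = measure_fps([None] * 20)
    print(f"{fps:.1f} fps")
```

With the simulated 50 ms per-frame cost, the loop comfortably clears the paper's 10 fps real-time bar; in the actual system the per-frame cost is the network's forward pass plus frame capture.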


References

  1. Lucey, P., Cohn, J.F., Kanade, T., Saragih, J., Ambadar, Z., Matthews, I.: The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression. In: Proceedings of the 3rd IEEE Workshop on CVPR for Human Communicative Behavior Analysis, San Francisco, CA, USA (2010)

  2. Video and image based emotion recognition challenges in the wild: EmotiW 2015. In: ACM International Conference on Multimodal Interaction (ICMI) (2015)

  3. Audio/visual emotion challenge and workshop: AVEC 2016. In: Proceedings of ACM Multimedia (2016)

  4. Tian, Y.-L., Kanade, T., Cohn, J.F.: Recognizing action units for facial expression analysis. IEEE Trans. Pattern Anal. Mach. Intell. 23(2), 97–115 (2001)

  5. Zhang, L., Tjondronegoro, D., Chandran, V.: Representation of facial expression categories in continuous arousal-valence space: feature and correlation. Image Vis. Comput. 32(12), 1067–1079 (2014)

  6. Liu, M., Wang, R., Li, S., Shan, S., Huang, Z., Chen, X.: Combining multiple kernel methods on Riemannian manifold for emotion recognition in the wild. In: Proceedings of the 16th International Conference on Multimodal Interaction, ICMI 2014, pp. 494–501. ACM, New York (2014)

  7. Sun, B., Li, L., Zuo, T., Chen, Y., Zhou, G., Wu, X.: Combining multimodal features with hierarchical classifier fusion for emotion recognition in the wild. In: Proceedings of the 16th International Conference on Multimodal Interaction, ICMI 2014, pp. 481–486. ACM, New York (2014)

  8. Ng, H.-W., Nguyen, V.D., Vonikakis, V., Winkler, S.: Deep learning for emotion recognition on small datasets using transfer learning. In: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pp. 443–449. ACM (2015)

  9. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS (2015)

  10. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes challenge 2007 (VOC 2007) results (2007)

  11. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: CVPR (2012)

  12. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  13. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)

  14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)

  15. Challenges in representation learning: facial expression recognition challenge. Kaggle Inc. (2013)

  16. https://github.com/rbgirshick/py-faster-rcnn

  17. https://www.dropbox.com/s/xtr4yd4i5e0vw8g/iccv15_tutorial_training_rbg.pdf?dl=0

  18. Ebrahimi Kahou, S., Michalski, V., Konda, K., Memisevic, R., Pal, C.: Recurrent neural networks for emotion recognition in video. In: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pp. 467–474. ACM (2015)

  19. http://dlib.net/

  20. Jung, H., Lee, S., Park, S., Lee, I., Ahn, C., Kim, J.: Deep temporal appearance-geometry network for facial expression recognition (2015). arXiv:1503.01532v1

  21. Ghimire, D., Lee, H., Li, Z.-N., Heong, S., Park, S.H., Choi, H.S.: Recognition of facial expressions based on tracking and selection of discriminative geometric features. Int. J. Multimedia Ubiquit. Eng. 10(3), 35–44 (2015)

  22. Korattikara, A., Rathod, V., Murphy, K., Welling, M.: Bayesian dark knowledge. In: NIPS (2015)

  23. Yu, Z., Zhang, C.: Image based static facial expression recognition with multiple deep network learning. In: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, pp. 435–442. ACM (2015)

  24. Kim, B.-K., Lee, H., Roh, J., Lee, S.-Y.: Hierarchical committee of deep CNNs with exponentially-weighted decision fusion for static facial expression recognition. In: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, ICMI 2015, pp. 427–434. ACM, New York (2015)

  25. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. IJCV 115, 211–252 (2015)


Author information

Correspondence to Ye-Hoon Kim.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Reddy, B., Kim, YH., Yun, S., Jang, J., Hong, S. (2017). End to End Deep Learning for Single Step Real-Time Facial Expression Recognition. In: Nasrollahi, K., et al. (eds.) Video Analytics. Face and Facial Expression Recognition and Audience Measurement. VAAM 2016, FFER 2016. Lecture Notes in Computer Science, vol. 10165. Springer, Cham. https://doi.org/10.1007/978-3-319-56687-0_8


  • DOI: https://doi.org/10.1007/978-3-319-56687-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56686-3

  • Online ISBN: 978-3-319-56687-0

  • eBook Packages: Computer Science, Computer Science (R0)
