Abstract
Facial Expression Recognition (FER) systems aims to classify human emotions through facial expression as one of seven basic emotions: happiness, sadness, fear, disgust, anger, surprise and neutral. FER is a very challenging problem due to the subtle differences that exist between its categories. Even though convolutional neural networks (CNN) achieved impressive results in several computer vision tasks, they still do not perform as well in FER. Many techniques, like bilinear pooling and improved bilinear pooling, have been proposed to improve the CNN performance on similar problems. The accuracy enhancement they brought in multiple visual tasks, shows that their is still room for improvement for CNNs on FER. In this paper, we propose to use bilinear and improved bilinear pooling with CNNs for FER. This framework has been evaluated on three well known datasets, namely ExpW, FER2013 and RAF-DB. It has shown that the use of bilinear and improved bilinear pooling with CNNs can enhance the overall accuracy to nearly 3% for FER and achieve state-of-the-art results.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Acharya, D., Huang, Z., Pani Paudel, D., Van Gool, L.: Covariance pooling for facial expression recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 367–374 (2018)
Bishay, M., Palasek, P., Priebe, S., Patras, I.: SchiNet: automatic estimation of symptoms of schizophrenia from facial behaviour analysis. IEEE Trans. Affect. Comput., 1 (2019)
Cui, Y., Zhou, F., Wang, J., Liu, X., Lin, Y., Belongie, S.: Kernel pooling for convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921–2930 (2017)
Devries, T., Biswaranjan, K., Taylor, G.W.: Multi-task learning of facial landmarks and expression. In: 2014 Canadian Conference on Computer and Robot Vision, pp. 98–103 (2014)
Fathallah, A., Abdi, L., Douik, A.: Facial expression recognition via deep learning. In: 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA), pp. 745–750 (2017)
Gao, Y., Beijbom, O., Zhang, N., Darrell, T.: Compact bilinear pooling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 317–326 (2016)
Goodfellow, I.J., et al.: Challenges in representation learning: a report on three machine learning contests. In: Lee, M., Hirose, A., Hou, Z.-G., Kil, R.M. (eds.) ICONIP 2013. LNCS, vol. 8228, pp. 117–124. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-42051-1_16
Guo, Y., Tao, D., Yu, J., Xiong, H., Li, Y., Tao, D.: Deep neural networks with relativity learning for facial expression recognition. In: 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp. 1–6. IEEE (2016)
Hamester, D., Barros, P., Wermter, S.: Face expression recognition with a 2-channel convolutional neural network. In: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2015)
Kim, B.K., Dong, S.Y., Roh, J., Kim, G., Lee, S.Y.: Fusing aligned and non-aligned face information for automatic affect recognition in the wild: a deep learning approach. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 48–57 (2016)
Li, S., Deng, W.: Deep facial expression recognition: a survey. arXiv preprint arXiv:1804.08348 (2018)
Li, S., Deng, W.: Reliable crowdsourcing and deep locality-preserving learning for unconstrained facial expression recognition. IEEE Trans. Image Process. 28(1), 356–370 (2018)
Li, S., Deng, W., Du, J.: Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2584–2593. IEEE (2017)
Lian, Z., Li, Y., Tao, J.-H., Huang, J., Niu, M.-Y.: Expression analysis based on face regions in real-world conditions. Int. J. Autom. Comput. 17(1), 96–107 (2019). https://doi.org/10.1007/s11633-019-1176-9
Lin, T.Y., Maji, S.: Improved bilinear pooling with CNNs. arXiv preprint arXiv:1707.06772 (2017)
Lin, T.Y., RoyChowdhury, A., Maji, S.: Bilinear CNN models for fine-grained visual recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1449–1457 (2015)
Liu, Z., Li, S., Deng, W.: Boosting-poof: boosting part based one vs one feature for facial expression recognition in the wild. In: 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 967–972. IEEE (2017)
Mahmoudi, M.A., Chetouani, A., Boufera, F., Tabia, H.: Kernelized dense layers for facial expression recognition. In: 2020 IEEE International Conference on Image Processing (ICIP), pp. 2226–2230 (2020)
Mahmoudi, M.A., Chetouani, A., Boufera, F., Tabia, H.: Learnable pooling weights for facial expression recognition. Pattern Recogn. Lett. 138, 644–650 (2020)
Mollahosseini, A., Hasani, B., Mahoor, M.H.: AffectNet: a database for facial expression, valence, and arousal computing in the wild. IEEE Trans. Affect. Comput. 10, 18–31 (2017)
Nguyen, D., Nguyen, K., Sridharan, S., Dean, D., Fookes, C.: Deep spatio-temporal feature fusion with compact bilinear pooling for multimodal emotion recognition. Comput. Vis. Image Underst. 174, 33–42 (2018)
Pons, G., Masip, D.: Multi-task, multi-label and multi-domain learning with residual convolutional networks for emotion recognition. arXiv preprint arXiv:1802.06664 (2018)
Tang, Y.: Deep learning using linear support vector machines. arXiv preprint arXiv:1306.0239 (2013)
Tenenbaum, J.B., Freeman, W.T.: Separating style and content with bilinear models. Neural Comput. 12(6), 1247–1283 (2000)
Zhang, Z., Luo, P., Loy, C.C., Tang, X.: From facial expression recognition to interpersonal relation prediction. Int. J. Comput. Vis. 126(5), 550–569 (2018). https://doi.org/10.1007/s11263-017-1055-1
Zhou, F., Kong, S., Fowlkes, C., Chen, T., Lei, B.: Fine-grained facial expression analysis using dimensional emotion model. arXiv preprint arXiv:1805.01024 (2018)
Zou, X., Wang, Z., Li, Q., Sheng, W.: Integration of residual network and convolutional neural network along with various activation functions and global pooling for time series classification. Neurocomputing, 367, 39–45 (2019)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Mahmoudi, M.A., Chetouani, A., Boufera, F., Tabia, H. (2021). Improved Bilinear Model for Facial Expression Recognition. In: Djeddi, C., Kessentini, Y., Siddiqi, I., Jmaiel, M. (eds) Pattern Recognition and Artificial Intelligence. MedPRAI 2020. Communications in Computer and Information Science, vol 1322. Springer, Cham. https://doi.org/10.1007/978-3-030-71804-6_4
Download citation
DOI: https://doi.org/10.1007/978-3-030-71804-6_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71803-9
Online ISBN: 978-3-030-71804-6
eBook Packages: Computer ScienceComputer Science (R0)