Abstract
Automatic facial affect recognition has wide applications in areas such as education, gaming, software development, automotive systems and medical care, but achieving appreciable performance on in-the-wild datasets is a non-trivial task. Although these datasets represent real-world scenarios better than in-lab datasets, they suffer from incomplete labels because annotation is difficult. Inspired by semi-supervised learning, this paper presents our submission to the Multi-Task Learning (MTL) Challenge and the Learning from Synthetic Data (LSD) Challenge at the 4th Affective Behavior Analysis in-the-wild (ABAW) 2022 Competition. The MTL challenge comprises three tasks: valence-arousal estimation, classification of expressions into basic emotions, and detection of action units. Our method, titled SS-MFAR (Semi-supervised Learning based Multi-task Facial Affect Recognition), uses a deep residual network as the backbone along with task-specific classifiers for each task. It applies adaptive thresholds for each expression class to select confident samples, via semi-supervised learning, from samples with incomplete labels. Performance is validated on the challenging s-Aff-Wild2 dataset. Source code is available at https://github.com/1980x/ABAW2022DMACS.
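To make the adaptive-threshold idea concrete, the following is a minimal NumPy sketch of per-class confident-sample selection on model softmax outputs. It is an illustration of the general technique, not the authors' exact formulation: the scaling rule (base threshold times the mean confidence of each predicted class) and the function names `adaptive_thresholds` and `select_confident` are assumptions for this sketch.

```python
import numpy as np

def adaptive_thresholds(probs, base_tau=0.8):
    """Compute one confidence threshold per expression class.

    probs: (N, C) softmax outputs on samples with incomplete labels.
    Each class threshold is the base threshold scaled by the mean
    confidence of samples currently predicted as that class, so
    well-learned classes get stricter thresholds than harder ones.
    """
    preds = probs.argmax(axis=1)          # predicted class per sample
    conf = probs.max(axis=1)              # confidence per sample
    num_classes = probs.shape[1]
    tau = np.full(num_classes, base_tau)
    for c in range(num_classes):
        mask = preds == c
        if mask.any():
            tau[c] = base_tau * conf[mask].mean()
    return tau

def select_confident(probs, base_tau=0.8):
    """Return indices of samples whose confidence exceeds the
    adaptive threshold of their predicted class."""
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    tau = adaptive_thresholds(probs, base_tau)
    return np.where(conf >= tau[preds])[0]
```

In a semi-supervised training loop, the indices returned by `select_confident` would identify the unlabeled samples whose pseudo-labels are trusted for the expression loss, while the rest are held back until the model's class-wise confidence improves.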
Acknowledgments
We dedicate this work to Our Guru Bhagawan Sri Sathya Sai Baba, Divine Founder Chancellor of Sri Sathya Sai Institute of Higher Learning, Prasanthi Nilayam, Andhra Pradesh, India. We are also grateful to D. Kollias for all the patience and support.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Gera, D., Raj Kumar, B.V., Badveeti, N.S.K., Balasubramanian, S. (2023). Facial Affect Recognition Using Semi-supervised Learning with Adaptive Threshold. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science (R0)