Abstract
In real-world scenarios, facial images captured by many devices are often of low resolution, and the performance of existing facial expression recognition methods degrades significantly on such images. Addressing low resolution in facial expression recognition is therefore an important problem, yet previous attempts to tackle it have been limited. To this end, we propose a novel Attention and Relative Distance Alignment (ARDA) method that integrates knowledge distillation into low-resolution facial expression recognition. Specifically, the Attention Alignment module guides the low-resolution student model to focus on the most crucial regions of the facial image by having it learn the attention maps of the high-resolution teacher model. The Relative Distance Alignment module uses the relative distances between facial image features to transfer the differences among low-resolution facial images from the teacher model to the student model, helping the student model better grasp the differences between expressions. Extensive experiments show that ARDA effectively transfers knowledge from the high-resolution teacher model to the low-resolution student model, achieving state-of-the-art performance on synthetic low-resolution facial expression recognition datasets.
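The two alignment terms described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the loss formulas, so the sketch assumes a channel-wise sum of squared activations as the attention map and mean-normalized pairwise Euclidean distances as the relative distances. The function names, tensor shapes, and the requirement that student and teacher feature maps share the same spatial size are all hypothetical choices made for illustration.

```python
import numpy as np


def attention_map(feat, eps=1e-8):
    """Collapse (N, C, H, W) features into L2-normalized (N, H*W) attention maps.

    Assumption: attention is the sum of squared activations over channels.
    """
    amap = (feat ** 2).sum(axis=1).reshape(feat.shape[0], -1)
    return amap / (np.linalg.norm(amap, axis=1, keepdims=True) + eps)


def attention_alignment_loss(student_feat, teacher_feat):
    """Mean squared L2 distance between student and teacher attention maps."""
    diff = attention_map(student_feat) - attention_map(teacher_feat)
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def relative_distances(emb, eps=1e-8):
    """Pairwise Euclidean distances of (N, D) embeddings, divided by their mean.

    Dividing by the mean off-diagonal distance makes the result independent
    of the overall scale of the embedding space. Requires N >= 2.
    """
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    off_diagonal = ~np.eye(emb.shape[0], dtype=bool)
    return dist / (dist[off_diagonal].mean() + eps)


def relative_distance_loss(student_emb, teacher_emb):
    """Mean squared gap between student and teacher relative-distance matrices."""
    diff = relative_distances(student_emb) - relative_distances(teacher_emb)
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical batch: 4 images, 8 channels, 7x7 feature maps, 16-d embeddings.
    teacher_feat = rng.normal(size=(4, 8, 7, 7))
    student_feat = rng.normal(size=(4, 8, 7, 7))
    teacher_emb = rng.normal(size=(4, 16))
    student_emb = rng.normal(size=(4, 16))
    print("attention alignment loss:", attention_alignment_loss(student_feat, teacher_feat))
    print("relative distance loss:", relative_distance_loss(student_emb, teacher_emb))
```

In a distillation setup of this kind, both terms would typically be added to the student's classification loss with weighting coefficients. Those coefficients, and the layers from which features are taken, are not stated in the abstract and are left out of the sketch.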
Acknowledgement
This work was supported by the National Key R&D Programme of China (2022YFC3803202) and the Major Project of Anhui Province under Grant 202203a05020011. This work was done at the Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
An, L., Sun, X., Zhang, Z., Wang, M. (2024). Attention and Relative Distance Alignment for Low-Resolution Facial Expression Recognition. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14429. Springer, Singapore. https://doi.org/10.1007/978-981-99-8469-5_18
DOI: https://doi.org/10.1007/978-981-99-8469-5_18
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8468-8
Online ISBN: 978-981-99-8469-5
eBook Packages: Computer Science, Computer Science (R0)