Margin-Mix: Semi-Supervised Learning for Face Expression Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12368)

Abstract

In this paper, we construct a semi-supervised learning algorithm that exploits the ability of deep convolutional networks to provide, for an input image, both an embedding descriptor and a prediction. Unlabeled data is combined with labeled data to produce synthetic samples that better describe the input space. The network is trained to enforce a large margin between clusters, while new data is self-labeled according to its distance to the class centroids in the embedding space. The method is tested on standard benchmarks for semi-supervised learning, where it matches state-of-the-art performance, and on the problem of face expression recognition, where it increases the accuracy by a noticeable margin.
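The self-labeling and mixing steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`class_centroids`, `self_label`, `mix`), the softmax over negative Euclidean distances, and the MixUp-style interpolation with `alpha=0.75` (a value borrowed from MixMatch) are all assumptions made for illustration, and the large-margin loss is not covered. Plain lists stand in for network embeddings.

```python
import math
import random


def class_centroids(embeddings, labels, num_classes):
    """Mean embedding per class; assumes every class has at least one labeled sample."""
    dim = len(embeddings[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, y in zip(embeddings, labels):
        counts[y] += 1
        for d, value in enumerate(vec):
            sums[y][d] += value
    return [[s / counts[c] for s in sums[c]] for c in range(num_classes)]


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def self_label(vec, centroids):
    """Soft pseudo-label: softmax over negative distances to the class centroids
    (an assumed choice; the abstract only states that distance to centroids is used)."""
    dists = [euclidean(vec, c) for c in centroids]
    nearest = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - nearest)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def mix(x1, y1, x2, y2, alpha=0.75, rng=random):
    """MixUp-style convex combination of two samples and their (soft) labels."""
    lam = rng.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # keep the result closer to the first sample, as MixMatch does
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Toy 2-D embeddings for two classes (hypothetical values).
labeled_emb = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.0, 3.8]]
labeled_y = [0, 0, 1, 1]
centroids = class_centroids(labeled_emb, labeled_y, num_classes=2)

# Self-label one unlabeled sample by its distance to the centroids.
unlabeled_emb = [0.3, 0.1]
pseudo = self_label(unlabeled_emb, centroids)
hard_label = pseudo.index(max(pseudo))

# Combine a labeled sample with the self-labeled one into a synthetic sample.
# In practice the mixing would act on network inputs; embeddings are reused here for brevity.
one_hot = [1.0, 0.0]
mixed_x, mixed_y, lam = mix(labeled_emb[0], one_hot, unlabeled_emb, pseudo,
                            rng=random.Random(0))
```

Using soft pseudo-labels rather than a hard nearest-centroid assignment keeps the mixed label a valid probability vector, which is convenient when it is later used as a training target.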

Notes

  1. Code is developed from the PyTorch implementation of MixMatch available at https://github.com/YU1ut/MixMatch-pytorch. Additional details may be retrieved from the project webpage (see note 2).

  2. imag.pub.ro/transfer

  3. Very recently, several SSL methods that report improved results were made public, although not yet published [4, 34, 39]. However, they propose augmentation techniques that complement the self-labeling procedure. Besides being very recent, they may be used together with the proposed method.

References

  1. Athiwaratkun, B., Finzi, M., Izmailov, P., Wilson, A.G.: There are many consistent explanations of unlabeled data: why you should average. In: ICLR (2019)

  2. Barsoum, E., Zhang, C., Ferrer, C., Zhang, Z.: Training deep networks for facial expression recognition with crowd-sourced label distribution. In: ICMI, pp. 279–283 (2016)

  3. Bartlett, M., Hager, J., Ekman, P., Sejnowski, T.: Measuring facial expressions by computer image analysis. Psychophysiology 36(2), 253–263 (1999)

  4. Berthelot, D., et al.: Remixmatch: semi-supervised learning with distribution alignment and augmentation anchoring. arXiv preprint arXiv:1911.09785 (2019)

  5. Berthelot, D., Carlini, N., Goodfellow, I., Papernot, N., Oliver, A., Raffel, C.A.: Mixmatch: a holistic approach to semi-supervised learning. In: NIPS, pp. 5050–5060 (2019)

  6. Chapelle, O., Schölkopf, B., Zien, A.: Semi-Supervised Learning. MIT Press (2006)

  7. Coates, A., Ng, A., Lee, H.: An analysis of single-layer networks in unsupervised feature learning. In: AISTATS, pp. 215–223 (2011)

  8. Corneanu, C., Simón, M., Cohn, J., Escalera, S.: Survey on RGB, 3D, thermal, and multimodal approaches for facial expression recognition: history, trends, and affect-related applications. IEEE Trans. PAMI 38(8), 1548–1568 (2016)

  9. Deng, J., Guo, J., Xue, N., Zafeiriou, S.: Arcface: additive angular margin loss for deep face recognition. In: CVPR (2019)

  10. Dhillon, I.S., Guan, Y., Kulis, B.: Kernel k-means: spectral clustering and normalized cuts. In: ACM-SIGKDD, pp. 551–556 (2004)

  11. Donahue, J., et al.: DeCAF: a deep convolutional activation feature for generic visual recognition. In: International Conference on Machine Learning (2014)

  12. Ekman, P., Rosenberg, E.: What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the FACS. Oxford Scholarship (2005)

  13. Florea, C., Florea, L., Vertan, C., Badea, M., Racoviteanu, A.: Annealed label transfer for face expression recognition. In: BMVC, p. 12 (2019)

  14. Goodfellow, I.J., et al.: Challenges in representation learning: a report on three machine learning contests. In: Lee, M., Hirose, A., Hou, Z.-G., Kil, R.M. (eds.) ICONIP 2013. LNCS, vol. 8228, pp. 117–124. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-42051-1_16

  15. Haeusser, P., Mordvintsev, A., Cremers, D.: Learning by association-a versatile semi-supervised training method for neural networks. In: CVPR, pp. 89–98 (2017)

  16. Ho-Phuoc, T.: CIFAR10 to compare visual recognition performance between deep neural networks and humans. CoRR abs/1811.07270 (2018)

  17. Kingma, D.P., Mohamed, S., Rezende, D.J., Welling, M.: Semi-supervised learning with deep generative models. In: NIPS, pp. 3581–3589 (2014)

  18. Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical report, University of Toronto (2009)

  19. Laine, S., Aila, T.: Temporal ensembling for semi-supervised learning. In: ICLR (2016)

  20. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)

  21. Lee, D.H.: Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks. In: ICML Workshops (2013)

  22. Li, S., Deng, W.: Reliable crowdsourcing and deep locality-preserving learning for unconstrained facial expression recognition. IEEE Trans. Image Process. 28(1), 356–370 (2019)

  23. Li, Y., Zeng, J., Shan, S., Chen, X.: Occlusion aware facial expression recognition using CNN with attention mechanism. IEEE Trans. Image Process. 28(5), 2439–2450 (2018)

  24. Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., Song, L.: Sphereface: deep hypersphere embedding for face recognition. In: CVPR, pp. 212–220 (2017)

  25. Miyato, T., Maeda, S.I., Koyama, M., Ishii, S.: Virtual adversarial training: a regularization method for supervised and semi-supervised learning. IEEE Trans. PAMI 41(8), 1979–1993 (2018)

  26. Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., Ng, A.Y.: Reading digits in natural images with unsupervised feature learning. In: NIPS Workshop on Deep Learning and Unsupervised Feature Learning (2011)

  27. Odena, A.: Semi-supervised learning with generative adversarial networks. In: ICML Workshop on Data-Efficient Machine Learning (2016)

  28. Oliver, A., Odena, A., Raffel, C., Cubuk, E.D., Goodfellow, I.J.: Realistic evaluation of deep semi-supervised learning algorithms. In: ICLR (2018)

  29. Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: NIPS, pp. 8024–8035 (2019)

  30. Peterson, J., Battleday, R., Griffiths, T., Russakovsky, O.: Human uncertainty makes classification more robust. In: ICCV, pp. 9617–9627 (2019)

  31. Rasmus, A., Berglund, M., Honkala, M., Valpola, H., Raiko, T.: Semi-supervised learning with ladder networks. In: NIPS, pp. 3546–3554 (2015)

  32. Ren, Y., Hu, K., Dai, X., Pan, L., Hoi, S.C., Xu, Z.: Semi-supervised deep embedded clustering. Neurocomputing 325, 121–130 (2019)

  33. Sariyanidi, E., Gunes, H., Cavallaro, A.: Automatic analysis of facial affect: a survey of registration, representation, and recognition. IEEE Trans. PAMI 37(6), 1113–1133 (2015)

  34. Sohn, K., et al.: Fixmatch: simplifying semi-supervised learning with consistency and confidence. arXiv preprint arXiv:2001.07685 (2020)

  35. Susskind, J., Littlewort, G., Bartlett, M., Movellan, J., Anderson, A.: Human and computer recognition of facial expressions of emotion. Neuropsychologia 45(1), 152–162 (2007)

  36. Tarvainen, A., Valpola, H.: Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. In: NIPS, pp. 1195–1204 (2017)

  37. Tran, E., Mayhew, M.B., Kim, H., Karande, P., Kaplan, A.D.: Facial expression recognition using a large out-of-context dataset. In: Proceedings of IEEE Winter Conference on Applications of Computer Vision, pp. 52–59 (2018)

  38. Verma, V., Lamb, A., Kannala, J., Bengio, Y., Lopez-Paz, D.: Interpolation consistency training for semi-supervised learning. In: IJCAI (2019)

  39. Wang, X., Kihara, D., Luo, J., Qi, G.J.: Enaet: Self-trained ensemble autoencoding transformations for semi-supervised learning. arXiv preprint arXiv:1911.09265 (2019)

  40. Wen, Y., Zhang, K., Li, Z., Qiao, Yu.: A discriminative feature learning approach for deep face recognition. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9911, pp. 499–515. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46478-7_31

  41. Zagoruyko, S., Komodakis, N.: Wide residual networks. In: BMVC (2016)

  42. Zeng, J., Shan, S., Chen, X.: Facial expression recognition with inconsistently annotated datasets. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11217, pp. 227–243. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01261-8_14

  43. Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: Mixup: beyond empirical risk minimization. In: ICLR (2018)

  44. Zhang, X., Fang, Z., Wen, Y., Li, Z., Qiao, Y.: Range loss for deep face recognition with long-tailed training data. In: CVPR, pp. 5409–5418 (2017)

  45. Zhang, Z., Han, J., Deng, J., Xu, X., Ringeval, F., Schuller, B.: Leveraging unlabeled data for emotion recognition with enhanced collaborative semi-supervised learning. IEEE Access 6, 22196–22209 (2018)

  46. Zhao, S., Cai, H., Liu, H., Zhang, J., Chen, S.: Feature selection mechanism in CNNs for facial expression recognition. In: BMVC, p. 12 (2018)

  47. Zheng, Y., Pal, D.K., Savvides, M.: Ring loss: convex feature normalization for face recognition. In: CVPR, pp. 5089–5097 (2018)

Corresponding author

Correspondence to Corneliu Florea.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 3988 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Florea, C., Badea, M., Florea, L., Racoviteanu, A., Vertan, C. (2020). Margin-Mix: Semi-Supervised Learning for Face Expression Recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12368. Springer, Cham. https://doi.org/10.1007/978-3-030-58592-1_1

  • DOI: https://doi.org/10.1007/978-3-030-58592-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58591-4

  • Online ISBN: 978-3-030-58592-1

  • eBook Packages: Computer Science, Computer Science (R0)
