
Mutual Improvement Between Temporal Ensembling and Virtual Adversarial Training

Neural Processing Letters

Abstract

Research on semi-supervised learning (SSL) is of great significance because collecting large quantities of labeled data is very expensive in some fields. Two recent deep learning-based SSL algorithms, temporal ensembling and virtual adversarial training (VAT), have achieved state-of-the-art accuracy on classical SSL tasks, yet both have shortcomings. Temporal ensembling perturbs its training data only with random noise, so its consistency regularization is not fully exploited. VAT, in turn, incurs considerable time cost because each epoch requires two inference passes for every unlabeled sample. In this paper, we propose using virtual adversarial perturbations (VAP) in temporal ensembling instead of random noise to improve performance. We also find that reusing VAP accelerates the training of VAT without an obvious loss of accuracy. Both methods are validated on MNIST, FashionMNIST and SVHN.
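The full text is behind the paywall, but the two ideas in the abstract can be sketched. Below is a minimal, hypothetical PyTorch sketch: `virtual_adversarial_perturbation` approximates the VAP with power iteration, following the standard VAT scheme of Miyato et al., and `temporal_ensembling_loss_with_vap` plugs that perturbation into the temporal-ensembling consistency term of Laine and Aila in place of random noise. All function names, hyperparameter defaults (`xi`, `eps`, `n_power`, the weight `w`) and the MSE form of the consistency loss are assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def _l2_normalize(d):
    # Flatten each sample, divide by its L2 norm, restore the original shape.
    d_flat = d.view(d.size(0), -1)
    norm = d_flat.norm(dim=1).view(-1, *([1] * (d.dim() - 1)))
    return d / (norm + 1e-8)

def virtual_adversarial_perturbation(model, x, xi=1e-6, eps=2.0, n_power=1):
    """Approximate the VAP by power iteration, following the standard VAT
    scheme of Miyato et al.; xi, eps and n_power are illustrative defaults."""
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)          # current predictions, held fixed
    d = _l2_normalize(torch.randn_like(x))      # random initial direction
    for _ in range(n_power):
        d.requires_grad_(True)
        log_p_hat = F.log_softmax(model(x + xi * d), dim=1)
        dist = F.kl_div(log_p_hat, p, reduction="batchmean")
        grad = torch.autograd.grad(dist, d)[0]  # direction that most increases KL
        d = _l2_normalize(grad.detach())
    return (eps * d).detach()                   # VAP of magnitude eps

def temporal_ensembling_loss_with_vap(model, x_unlabeled, ema_targets, w):
    """Temporal-ensembling consistency term with the random input noise
    replaced by the VAP, as the abstract proposes; ema_targets is the
    exponential moving average of past-epoch predictions (Laine & Aila)."""
    r_adv = virtual_adversarial_perturbation(model, x_unlabeled)
    preds = F.softmax(model(x_unlabeled + r_adv), dim=1)
    return w * F.mse_loss(preds, ema_targets)
```

The abstract's second idea, reusing VAP to accelerate VAT, would then amount to caching `r_adv` for each sample and recomputing it only every few epochs rather than at every step, skipping the extra inference passes in between; the exact reuse schedule is not stated on this page.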



Acknowledgements

This work was supported by the National Key R&D Program of China under Grant 2017YFC1501301, the Natural Science Foundation of China under Grants 61876219, 61503144, 61673188 and 61761130081, and the Natural Science Foundation of Hubei Province of China under Grant 2017CFB519.

Author information


Corresponding author

Correspondence to Cheng Lian.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhou, W., Lian, C., Zeng, Z. et al. Mutual Improvement Between Temporal Ensembling and Virtual Adversarial Training. Neural Process Lett 51, 1111–1124 (2020). https://doi.org/10.1007/s11063-019-10132-7

