Abstract
Research on semi-supervised learning (SSL) is of great significance because collecting large quantities of labeled data is very expensive in some fields. Two recent deep-learning-based SSL algorithms, temporal ensembling and virtual adversarial training (VAT), have achieved state-of-the-art accuracy on several classical SSL tasks, yet both have shortcomings. Temporal ensembling simply adds random noise to the training data, so its consistency regularization is not fully exploited. VAT, in turn, incurs considerable time cost because each unlabeled sample requires two extra inferences per epoch to compute the perturbation. In this paper, we propose using virtual adversarial perturbations (VAP) instead of random noise in temporal ensembling to improve performance. We also find that reusing VAP can accelerate the training of VAT without an obvious loss of accuracy. Both methods are validated on MNIST, Fashion-MNIST and SVHN.
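At the core of both proposed modifications is the virtual adversarial perturbation itself. Below is a minimal sketch of how a VAP can be computed with VAT's one-step power-iteration approximation (following Miyato et al.); the function names, the ξ and ε defaults, and the use of a single power-iteration step are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def _l2_normalize(d: torch.Tensor) -> torch.Tensor:
    # Rescale each sample's perturbation to unit L2 norm.
    flat = d.view(d.size(0), -1)
    norm = flat.norm(dim=1).clamp_min(1e-12)
    return d / norm.view(-1, *([1] * (d.dim() - 1)))

def virtual_adversarial_perturbation(model, x, xi=1e-6, eps=2.0, n_power=1):
    """Approximate r_vadv, the direction that most changes the model's output."""
    with torch.no_grad():
        logits = model(x)                       # "virtual" labels: current predictions
    d = _l2_normalize(torch.randn_like(x))      # random start for power iteration
    for _ in range(n_power):
        d = (xi * d).requires_grad_(True)       # xi: small finite-difference step
        # KL divergence between predictions on clean and perturbed inputs.
        kl = F.kl_div(F.log_softmax(model(x + d), dim=1),
                      F.softmax(logits, dim=1), reduction="batchmean")
        d = _l2_normalize(torch.autograd.grad(kl, d)[0])
    return eps * d                              # adversarial perturbation of radius eps
```

The returned perturbation can then stand in for the random noise in temporal ensembling's consistency term, or be cached and reused in later steps, as the abstract proposes, instead of being recomputed from scratch.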
Acknowledgements
The work was supported by the National Key R&D Program of China under Grant 2017YFC1501301, the Natural Science Foundation of China under Grants 61876219, 61503144, 61673188 and 61761130081, the Natural Science Foundation of Hubei Province of China under Grant 2017CFB519.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Cite this article
Zhou, W., Lian, C., Zeng, Z. et al. Mutual Improvement Between Temporal Ensembling and Virtual Adversarial Training. Neural Process Lett 51, 1111–1124 (2020). https://doi.org/10.1007/s11063-019-10132-7