Abstract
Neural networks (NNs) are prone to overfitting, especially deep neural networks in cases where the training data are not abundant. Several techniques allow us to prevent overfitting, e.g., L1/L2 regularization, unsupervised pre-training, early stopping, dropout, bootstrapping, or cross-validation model aggregation. In this paper, we propose a regularization post-layer that may be combined with the prior techniques and brings additional robustness to the NN. We trained the regularization post-layer in the cross-validation (CV) aggregation scenario: we used the CV held-out folds to train an additional neural-network post-layer that boosts the network's robustness. We tested various post-layer topologies and compared the results with other regularization techniques. As a benchmark task, we selected TIMIT phone recognition, a well-known and still popular task where the training data are limited and the regularization techniques used play a key role. However, the regularization post-layer is a general method, and it may be employed in any classification task.
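The abstract describes the training scheme only at a high level. A minimal sketch of the cross-validation aggregation idea, assuming PyTorch, simple MLP base networks, and log-posteriors as post-layer input (all of these are our assumptions for illustration, not the paper's implementation), could look as follows:

```python
# Hypothetical sketch of CV aggregation with a regularization post-layer.
# Assumed, not from the paper: MLP base nets, log-posterior features,
# and the helper names used below.
import torch
import torch.nn as nn
from sklearn.model_selection import KFold

def train(model, X, y, epochs=10, lr=1e-3):
    """Full-batch training with cross-entropy loss."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return model

n_dim, n_classes = 40, 39          # e.g., filterbank features, TIMIT phone set
X = torch.randn(1000, n_dim)       # stand-in for real training data
y = torch.randint(0, n_classes, (1000,))

held_out_logposts, held_out_labels = [], []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True,
                                random_state=0).split(X):
    base = nn.Sequential(nn.Linear(n_dim, 256), nn.ReLU(),
                         nn.Linear(256, n_classes))
    train(base, X[train_idx], y[train_idx])
    with torch.no_grad():
        # Log-posteriors of the held-out fold become the training
        # data for the post-layer.
        held_out_logposts.append(torch.log_softmax(base(X[val_idx]), dim=1))
        held_out_labels.append(y[val_idx])

# The post-layer is trained only on held-out predictions, so it sees
# realistic (non-overfitted) base-network outputs.
post = nn.Linear(n_classes, n_classes)
train(post, torch.cat(held_out_logposts), torch.cat(held_out_labels))
```

At test time, the post-layer would be applied on top of the aggregated base-network outputs; the exact post-layer topologies and the aggregation rule are those studied in the full paper.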
Acknowledgments
This research was supported by the Grant Agency of the Czech Republic, project No. GAČR GBP103/12/G084.
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Vaněk, J., Zelinka, J., Soutner, D., Psutka, J. (2017). A Regularization Post Layer: An Additional Way How to Make Deep Neural Networks Robust. In: Camelin, N., Estève, Y., Martín-Vide, C. (eds.) Statistical Language and Speech Processing. SLSP 2017. Lecture Notes in Computer Science, vol. 10583. Springer, Cham. https://doi.org/10.1007/978-3-319-68456-7_17
Print ISBN: 978-3-319-68455-0
Online ISBN: 978-3-319-68456-7