Abstract
The ultimate goal of training a deep neural network is to produce a model that performs well both on the training data and on the unseen data it will later make predictions on. Overfitting describes the situation in which the prediction model produced by a machine learning algorithm adapts too closely to the training data; it occurs when a network is fitted too tightly to a limited set of input data. In this scenario, the predictive model makes very strong predictions on the training set because it captures not only the generalizable correlations but also the noise in the data, yet it predicts poorly on data it has not seen during its learning phase. Among the many existing techniques, this publication examines two methods to lessen or prevent overfitting. Additionally, by examining the dynamics during training, we propose a consensus classification approach that prevents overfitting, and we assess how well these two types of algorithms perform in convolutional neural networks. First, early stopping makes it possible to save a model’s parameters at the appropriate moment during training. Dropout, in turn, makes learning the model more challenging: the fundamental concept behind dropout neural networks is to randomly remove nodes so that the network is forced to rely on other features, which reduces the model’s loss rate and allows gains of more than 50%. This study also looks into the connection between node dropout regularization and the quality of the underlying randomness in preventing neural networks from overfitting.
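To make the two techniques discussed in the abstract concrete, the following is a minimal, hedged sketch (not the paper's exact experimental setup): it combines dropout layers with an early-stopping callback in Keras, restoring the best weights once validation loss stops improving. The network size, dropout rate of 0.5, patience of 5 epochs, and the synthetic data are illustrative assumptions, not values taken from the study.

# Minimal sketch: dropout + early stopping in Keras (illustrative only).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative synthetic data standing in for a real training set.
rng = np.random.default_rng(0)
x_train = rng.random((1000, 20)).astype("float32")
y_train = (x_train.sum(axis=1) > 10).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),   # randomly drop half of the units at each training step
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Early stopping: halt once validation loss stops improving and
# restore the weights from the best epoch seen so far.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

model.fit(x_train, y_train, validation_split=0.2,
          epochs=100, batch_size=32, callbacks=[early_stop])

In this kind of setup, dropout perturbs the network during training so it cannot co-adapt to noise, while early stopping with restore_best_weights keeps the checkpoint taken before validation performance starts to degrade.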
Supported by sabiss.net.
Acknowledgements
We thank the reviewers of ICEIS 2022, who devoted their precious time to reading the initial version of this article. Their comments and suggestions allowed for the continuous improvement of its content.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sabiri, B., EL Asri, B., Rhanoui, M. (2023). Efficient Deep Neural Network Training Techniques for Overfitting Avoidance. In: Filipe, J., Śmiałek, M., Brodsky, A., Hammoudi, S. (eds) Enterprise Information Systems. ICEIS 2022. Lecture Notes in Business Information Processing, vol 487. Springer, Cham. https://doi.org/10.1007/978-3-031-39386-0_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39385-3
Online ISBN: 978-3-031-39386-0