Efficient Deep Neural Network Training Techniques for Overfitting Avoidance

  • Conference paper
  • In: Enterprise Information Systems (ICEIS 2022)

Abstract

The ultimate goal of training a deep neural network is to produce a model that performs well both on the training data and on the unseen data it will later use for prediction. Overfitting describes how closely the prediction model created by the machine learning algorithm adapts to the training data: it occurs when a network is fitted too closely to a limited set of input data. In this scenario, the predictive model makes very strong predictions on the training set, because it captures not only the generalizable correlations but also the noise in the data, yet it predicts poorly on data it has not seen during its learning phase. Among the many existing techniques, two methods for reducing or preventing overfitting are examined in this publication. In addition, by examining the dynamics during training, we propose a consensus classification approach that prevents overfitting, and we assess how well these two kinds of algorithms perform in convolutional neural networks. First, early stopping makes it possible to save a model's parameters at the appropriate point in training. Second, dropout makes learning the model more challenging: the fundamental idea behind dropout neural networks is to randomly remove nodes so that the network is forced to rely on other features, which reduces the model's loss and can yield gains of more than 50%. This study also investigated the connection between node dropout regularization and the quality of randomness in preventing neural networks from overfitting.

Supported by sabiss.net.
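
As a concrete illustration of the two techniques discussed above, the sketch below combines a dropout layer with an early-stopping callback in Keras. This is a minimal, hypothetical example rather than the authors' experimental setup: the CNN architecture, the 0.5 dropout rate, the patience of 3 epochs, and the use of MNIST are assumptions made purely for illustration.

```python
# Minimal illustrative sketch (assumed setup, not the paper's exact experiments):
# a small Keras CNN on MNIST that uses dropout and early stopping together.
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

# Load MNIST, add a channel dimension, and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),  # randomly drop half of the units at each training step
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Early stopping: watch the validation loss, stop once it stops improving for
# 3 epochs, and restore the weights from the best epoch seen so far.
early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                     restore_best_weights=True)

model.fit(x_train, y_train, validation_split=0.1,
          epochs=50, batch_size=128, callbacks=[early_stop])
model.evaluate(x_test, y_test)
```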



Acknowledgements

We thank the reviewers of ICEIS 2022, who devoted their precious time to reading the initial version of this article. Their comments and suggestions allowed continuous improvement of its content.

Author information


Corresponding author

Correspondence to Bihi Sabiri.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Sabiri, B., EL Asri, B., Rhanoui, M. (2023). Efficient Deep Neural Network Training Techniques for Overfitting Avoidance. In: Filipe, J., Śmiałek, M., Brodsky, A., Hammoudi, S. (eds) Enterprise Information Systems. ICEIS 2022. Lecture Notes in Business Information Processing, vol 487. Springer, Cham. https://doi.org/10.1007/978-3-031-39386-0_10

  • DOI: https://doi.org/10.1007/978-3-031-39386-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-39385-3

  • Online ISBN: 978-3-031-39386-0

  • eBook Packages: Computer Science (R0)
