Using Hessians as a Regularization Technique

Rahimi, Adel; Kodliuk, Tetiana; Benchekroun, Othman

doi:10.1007/978-3-030-64583-0_4


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12565)

Included in the following conference series:

International Conference on Machine Learning, Optimization, and Data Science


Abstract

In this paper we present a novel yet simple method to regularize the optimization of neural networks using second-order derivatives. In the proposed method, we calculate the Hessians of the last n layers of a neural network, then re-initialize the top k percent by absolute value. In our experiments, this method reached a better minimum of the loss function more efficiently. The results show that it offers a significant improvement over the baseline and helps the optimizer converge faster.
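The abstract gives only an outline of the procedure, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes that only the diagonal of the Hessian is used, that it is estimated by central finite differences, that "top k percent" means the selected parameters with the largest absolute curvature, and that re-initialization draws from a small uniform range. The names `hessian_diagonal`, `reinit_top_k_percent`, `toy_loss`, and the `scale` parameter are hypothetical.

```python
import random


def hessian_diagonal(loss, params, indices, eps=1e-3):
    """Estimate d^2 loss / d p_i^2 for each i in indices by central differences.

    Assumption: only the Hessian diagonal is needed for the ranking step.
    """
    base = loss(params)
    diag = {}
    for i in indices:
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        diag[i] = (loss(up) - 2.0 * base + loss(down)) / (eps * eps)
    return diag


def reinit_top_k_percent(loss, params, indices, k_percent, rng, scale=0.1):
    """Redraw the selected parameters whose |Hessian diagonal| is in the top k percent.

    `indices` stands in for "the last n layers"; `scale` is an assumed
    re-initialization range, U(-scale, scale).
    Returns the new parameter list and the indices that were redrawn.
    """
    diag = hessian_diagonal(loss, params, indices)
    n_reset = max(1, round(len(indices) * k_percent / 100.0))
    ranked = sorted(indices, key=lambda i: abs(diag[i]), reverse=True)
    chosen = ranked[:n_reset]
    new_params = list(params)
    for i in chosen:
        new_params[i] = rng.uniform(-scale, scale)
    return new_params, chosen


def toy_loss(p):
    """Toy quadratic loss with a different curvature for each parameter."""
    curvatures = [1.0, 50.0, 2.0, 0.5]
    return sum(0.5 * c * x * x for c, x in zip(curvatures, p))


rng = random.Random(0)
params = [1.0, 1.0, 1.0, 1.0]
# Treat parameters 1..3 as the "last layers" and redraw roughly the top third.
new_params, chosen = reinit_top_k_percent(
    toy_loss, params, indices=[1, 2, 3], k_percent=34, rng=rng
)
print(chosen, new_params)
```

In a real network the finite-difference loop would normally be replaced by automatic differentiation, and the step would be interleaved with ordinary optimizer updates. How often to apply it and how to choose n and k are not stated in the abstract.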



Author information

Authors and Affiliations

Dathena Science Pte. Ltd., #07-02, 1 George St., Singapore, 049145, Singapore
Adel Rahimi, Tetiana Kodliuk & Othman Benchekroun

Corresponding author

Correspondence to Adel Rahimi.

Editor information

Editors and Affiliations

University of Catania, Catania, Italy
Giuseppe Nicosia
University of Reading, Reading, UK
Varun Ojha
University of Oxford, Oxford, UK
Emanuele La Malfa
University of Cambridge, Cambridge, UK
Giorgio Jansen
Almawave, Rome, Italy
Vincenzo Sciacca
University of Florida, Gainesville, FL, USA
Panos Pardalos
University of Catania, Catania, Italy
Giovanni Giuffrida
Harvard University, Cambridge, MA, USA
Renato Umeton

About this paper

Cite this paper

Rahimi, A., Kodliuk, T., Benchekroun, O. (2020). Using Hessians as a Regularization Technique. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol 12565. Springer, Cham. https://doi.org/10.1007/978-3-030-64583-0_4

DOI: https://doi.org/10.1007/978-3-030-64583-0_4
Published: 08 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64582-3
Online ISBN: 978-3-030-64583-0
eBook Packages: Computer Science, Computer Science (R0)
