
Projective Fisher Information for Natural Gradient Descent


Impact Statement:
Deep neural networks achieve state-of-the-art results on various deep learning problems such as computer vision and speech recognition. However, they are challenging to train, require extensive hyperparameter tuning, and demand substantial training time, often running into many weeks. It is therefore important to ease training by reducing both the training time and the amount of parameter tuning required. Natural gradient-based optimization methods reduce training time and the need for extensive tuning, but are too complex for many networks. The training and analysis algorithm introduced here significantly reduces the time and effort required for training while not impacting the performance metric achieved at the given task.

Abstract:

Improvements in neural network optimization algorithms have enabled shorter training times and the ability to reach state-of-the-art performance on various machine learning tasks. Fisher information-based natural gradient descent is one such second-order method that improves the convergence speed and the final performance metric achieved for many machine learning algorithms. Fisher information matrices are also helpful for analyzing the properties and expected behavior of neural networks. However, natural gradient descent is a high-complexity method due to the need to maintain and invert covariance matrices. This is especially the case with modern deep neural networks, which have a very large number of parameters and for which the problem often becomes computationally infeasible. We suggest using the Fisher information for analysis of the parameter space of fully connected and convolutional neural networks without calculating the matrix itself. We also propose a lower complexity natural gradient...
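As context for the complexity argument in the abstract, the following is a minimal NumPy sketch of one plain natural gradient descent step using the empirical Fisher information, plus a matrix-free Fisher-vector product of the kind that avoids forming the matrix explicitly. This is an illustration of the standard technique only, not the paper's projective method; the model size, damping term, and learning rate are assumed values.

import numpy as np

# Toy setup: n_params model parameters, n_samples data points.
rng = np.random.default_rng(0)
n_params, n_samples = 10, 100

# Per-sample gradients of the log-likelihood, stacked row-wise.
# In practice these come from backpropagation; random stand-ins
# are used here just to show the linear algebra.
per_sample_grads = rng.normal(size=(n_samples, n_params))

# Empirical Fisher: average outer product of per-sample gradients.
fisher = per_sample_grads.T @ per_sample_grads / n_samples

# Mean gradient of the loss over the batch.
grad = per_sample_grads.mean(axis=0)

# Natural gradient: precondition the gradient by the (damped)
# inverse Fisher. Solving against this n x n matrix is the O(n^3)
# cost that becomes infeasible for deep networks.
damping, lr = 1e-3, 0.1
natural_grad = np.linalg.solve(fisher + damping * np.eye(n_params), grad)

theta = rng.normal(size=n_params)
theta -= lr * natural_grad

def fisher_vec(v):
    # Matrix-free Fisher-vector product: F v = G^T (G v) / n,
    # computed without ever forming the n x n Fisher matrix.
    return per_sample_grads.T @ (per_sample_grads @ v) / n_samples

# Sanity check: the implicit product matches the explicit matrix.
v = rng.normal(size=n_params)
assert np.allclose(fisher @ v, fisher_vec(v))

The explicit solve against the n x n Fisher is what scales cubically with the parameter count; the matrix-free product illustrates why analyses and updates that only need F applied to a vector, rather than F itself, remain tractable for large networks.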
Published in: IEEE Transactions on Artificial Intelligence (Volume: 4, Issue: 2, April 2023)
Page(s): 304 - 314
Date of Publication: 25 February 2022
Electronic ISSN: 2691-4581
