Abstract:
Preconditioning is a technique widely used to accelerate the convergence of optimization algorithms. Recently proposed efficient second-order algorithms (such as KFAC) showed that preconditioning the gradient using the curvature information of the loss function can help achieve faster convergence. However, their practicality in large-scale deep learning is still limited by high computational and storage costs. In this work, we propose a stochastic adaptive gradient algorithm, called Mini-Block Adaptive Gradient (MBAG), that addresses these computational challenges in computing the preconditioning matrix. To reduce the per-iteration cost, MBAG analytically computes the inverse of the preconditioning matrix using the matrix inversion lemma and then approximately finds its square root using an iterative solver. Further, to mitigate the storage requirement, MBAG partitions the model parameters into small subsets and computes only the sub-blocks of the preconditioner associated with each subset. This greatly improves the scalability of the proposed algorithm. The performance of MBAG is compared to that of popular first- and second-order algorithms on auto-encoder and classification tasks using real datasets.
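
A minimal sketch of the two preconditioning steps described above, assuming each mini-block preconditioner has the low-rank-plus-damping form P = G G^T / m + lam * I built from a b-by-m matrix G of stacked recent gradients; this form, the function names, and the choice of a Newton-Schulz iteration as the iterative square-root solver are illustrative assumptions, not details given in the abstract.

import numpy as np

def block_inverse(G, lam):
    # Invert the assumed mini-block preconditioner P = G @ G.T / m + lam * I
    # (b x b) via the matrix inversion lemma (Woodbury identity), so only an
    # m x m linear system is solved instead of a b x b one.
    b, m = G.shape
    inner = lam * np.eye(m) + G.T @ G / m        # small m x m core matrix
    return (np.eye(b) - G @ np.linalg.solve(inner, G.T) / m) / lam

def newton_schulz_sqrt(A, iters=15):
    # Approximate the square root of a symmetric positive-definite matrix A
    # with the coupled Newton-Schulz iteration (one possible iterative solver;
    # the paper's actual choice is not specified in the abstract).
    n = A.shape[0]
    norm = np.linalg.norm(A)                     # scale so the iteration converges
    Y, Z = A / norm, np.eye(n)
    for _ in range(iters):
        T = 0.5 * (3.0 * np.eye(n) - Z @ Y)
        Y, Z = Y @ T, T @ Z
    return Y * np.sqrt(norm)                     # undo the scaling

# Usage sketch: precondition the stochastic gradient of one mini-block.
rng = np.random.default_rng(0)
b, m, lam = 64, 8, 1e-3                          # block size, gradient history, damping
G = rng.standard_normal((b, m))                  # placeholder gradient history
g = rng.standard_normal(b)                       # current mini-block gradient
P_inv = block_inverse(G, lam)                    # analytic inverse via the lemma
update = newton_schulz_sqrt(P_inv) @ g           # iterative square root of the inverse, then precondition

Because the inverse and its square root are formed block by block, only small b x b matrices are ever stored, which is the storage saving that the mini-block partitioning is meant to provide.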
Date of Conference: 17-20 December 2022
Date Added to IEEE Xplore: 26 January 2023