Abstract:
Deep neural networks (DNNs) have well-documented merits in learning nonlinear functions in high-dimensional spaces. Stochastic gradient descent (SGD)-type optimization algorithms are the 'workhorse' for training DNNs. Nonetheless, such algorithms often suffer from slow convergence, sizable fluctuations, and an abundance of local solutions, among other issues. In this context, the present paper draws ideas from adaptive control of dynamical systems, and develops an adaptive proportional-integral-derivative (AdaPID) solver for fast, stable, and effective training of DNNs. AdaPID relies on second-order moment estimates of gradients to adaptively adjust the PID coefficients. Numerical tests corroborate the merits of AdaPID on several tasks such as image generation using generative adversarial networks (GANs) and image classification using convolutional neural networks (CNNs) as well as long short-term memory networks (LSTMs).
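The abstract frames the optimizer as a PID controller on gradients, with coefficients adapted via second-order moment estimates. The paper's exact update rule is not given here, so the following is only a minimal sketch of the general idea: the proportional term is the current gradient, the integral term a running sum of gradients, the derivative term the gradient difference, and the combined step is scaled by an Adam-style second-moment estimate. All names, coefficient values, and the scaling choice are assumptions for illustration.

```python
import math

def adapid_step(theta, grad, state, lr=1e-3, kp=1.0, ki=0.1, kd=0.5,
                beta2=0.999, eps=1e-8):
    """Hypothetical PID-style parameter update (not the paper's exact rule).

    P term: current gradient; I term: running sum of gradients;
    D term: difference between consecutive gradients.
    """
    state["integral"] = state.get("integral", 0.0) + grad
    deriv = grad - state.get("prev_grad", grad)
    state["prev_grad"] = grad
    # Adam-style second-moment estimate (EMA of squared gradients),
    # used here to scale the PID step -- one plausible reading of
    # "adaptively adjust the PID coefficients".
    state["v"] = beta2 * state.get("v", 0.0) + (1.0 - beta2) * grad * grad
    update = kp * grad + ki * state["integral"] + kd * deriv
    return theta - lr * update / (math.sqrt(state["v"]) + eps), state

# Toy usage: minimize f(x) = x^2, whose gradient is 2x, starting at x = 5.
theta, state = 5.0, {}
for _ in range(50):
    theta, state = adapid_step(theta, 2.0 * theta, state)
```

In this sketch the integral term accumulates persistent gradient directions (accelerating progress along consistent descent directions), while the derivative term damps abrupt gradient changes, which is the control-theoretic intuition behind PID-based optimizers.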
Published in: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 23-27 May 2022
Date Added to IEEE Xplore: 27 April 2022