
Gradient Descent with Low-Rank Objective Functions



Abstract:

Several recent empirical studies demonstrate that important machine learning tasks, e.g., training deep neural networks, exhibit low-rank structure, where the loss function varies significantly in only a few directions of the input space. In this paper, we leverage such low-rank structure to reduce the high computational cost of canonical gradient-based methods such as gradient descent (GD). Our proposed Low-Rank Gradient Descent (LRGD) algorithm finds an $\epsilon$-minimizer of a $p$-dimensional function by first identifying $r \leq p$ significant directions, and then estimating the true $p$-dimensional gradient at every iteration by computing directional derivatives only along those $r$ directions. We establish that the "directional oracle complexity" of LRGD for strongly convex objective functions is $\mathcal{O}(r\log(1/\epsilon) + rp)$. Therefore, when $r \ll p$, LRGD provides significant improvement over the known complexity of $\mathcal{O}(p\log(1/\epsilon))$ of GD in the strongly convex setting. Furthermore, using real and synthetic data, we empirically find that LRGD provides significant gains over GD when the data has low-rank structure, and in the absence of such structure, LRGD does not degrade performance compared to GD.
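The following is a minimal sketch of the two-phase procedure the abstract describes: first identify $r$ significant directions, then run descent steps using only $r$ directional derivatives per iteration. The function names (f, grad_f), the random-sampling-plus-SVD subspace estimation, and the finite-difference directional derivatives are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def lrgd(f, grad_f, x0, r, step_size=0.1, tol=1e-6, max_iters=1000):
    """Illustrative sketch of Low-Rank Gradient Descent (LRGD).

    Phase 1: estimate an r-dimensional active subspace from a few full
    gradients (roughly the O(rp) directional-derivative cost in the paper).
    Phase 2: at each iteration, approximate the gradient using directional
    derivatives only along the r identified directions.
    NOTE: the subspace-identification scheme below is a stand-in, not
    necessarily the paper's exact procedure.
    """
    p = x0.shape[0]
    rng = np.random.default_rng(0)

    # Phase 1: sample r full gradients near x0 and take their top-r
    # right singular vectors as the significant directions.
    samples = rng.standard_normal((r, p))
    G = np.stack([grad_f(x0 + s) for s in samples])       # r x p gradient matrix
    U = np.linalg.svd(G, full_matrices=False)[2][:r].T    # p x r orthonormal basis

    # Phase 2: descend using only r directional derivatives per step.
    x = x0.copy()
    h = 1e-6
    for _ in range(max_iters):
        fx = f(x)
        # Finite-difference directional derivative along each basis column.
        dirderivs = np.array([(f(x + h * U[:, i]) - fx) / h for i in range(r)])
        g_lowrank = U @ dirderivs                          # low-rank gradient estimate
        if np.linalg.norm(g_lowrank) < tol:
            break
        x -= step_size * g_lowrank
    return x

# Usage on a strongly convex quadratic whose curvature lies in few directions:
if __name__ == "__main__":
    p, r = 100, 3
    A = np.zeros(p); A[:r] = [5.0, 2.0, 1.0]               # low-rank curvature
    f = lambda x: 0.5 * np.sum(A * x**2) + 1e-3 * np.sum(x**2)
    grad_f = lambda x: A * x + 2e-3 * x
    x_star = lrgd(f, grad_f, np.ones(p), r)
    print("final objective:", f(x_star))
```

Each iteration of this sketch costs r function evaluations (directional derivatives) instead of a full p-dimensional gradient, which is the source of the $\mathcal{O}(r\log(1/\epsilon) + rp)$ versus $\mathcal{O}(p\log(1/\epsilon))$ comparison stated in the abstract when $r \ll p$.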
Date of Conference: 13-15 December 2023
Date Added to IEEE Xplore: 19 January 2024
Conference Location: Singapore, Singapore

