Abstract
Proportional-integral-derivative (PID) optimizers have been shown to alleviate the oscillation problem of stochastic gradient descent with momentum (SGD-M). To restrain the high-frequency noise caused by minibatch data, existing PID optimizers use a filtered gradient difference as the D term, which slows the response and may degrade convergence. This paper proposes a new adaptive PID optimizer that uses no filter. The optimizer combines the present gradient (P), a momentum term (I), and an improved gradient-difference term (D). The improved D term is obtained by applying an adaptive saturation function to the gradient difference, which suppresses both oscillation and high-frequency noise. Furthermore, the magnitude of this function adapts to the PI term, balancing the contributions of the PI and D terms. As a result, the proposed adaptive PID optimizer reduces oscillation and achieves up to 32% acceleration with competitive accuracy, as demonstrated by experiments on three commonly used benchmark datasets of different scales.
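The update structure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact saturation function or the gains, so everything specific here is an assumption made for illustration: the hard clip used as the saturation, the bound `ratio * |PI term|`, the gains `kp`, `ki`, `kd`, and the function name `adaptive_pid_step` are all hypothetical and should not be read as the authors' formulation.

```python
import numpy as np

def adaptive_pid_step(theta, grad, state, lr=0.1, momentum=0.9,
                      kp=1.0, ki=1.0, kd=1.0, ratio=0.5):
    """One illustrative PID-style update (hypothetical, not the paper's exact rule)."""
    # I term: momentum-style accumulation of past gradients.
    velocity = momentum * state["velocity"] + grad
    # Combined PI term: present gradient plus accumulated momentum.
    pi_term = kp * grad + ki * velocity
    # Raw, unfiltered gradient difference.
    diff = grad - state["prev_grad"]
    # Assumed saturation: clip the difference elementwise to a bound
    # that scales with the magnitude of the PI term.
    bound = ratio * np.abs(pi_term)
    d_term = np.clip(diff, -bound, bound)
    new_theta = theta - lr * (pi_term + kd * d_term)
    new_state = {"velocity": velocity, "prev_grad": grad.copy()}
    return new_theta, new_state

# Usage on a toy quadratic loss 0.5 * sum(curvature * theta**2).
theta = np.array([1.0, -2.0])
state = {"velocity": np.zeros_like(theta), "prev_grad": np.zeros_like(theta)}
curvature = np.array([1.0, 10.0])
for _ in range(5):
    grad = curvature * theta
    theta, state = adaptive_pid_step(theta, grad, state, lr=0.01)
print(theta)
```

In this sketch the D contribution can never exceed a fixed fraction of the PI contribution, which is one simple way to realize the balance between the PI and D terms that the abstract mentions; a smooth saturation such as a scaled `tanh` would be an equally plausible choice.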
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tang, W., Zhao, Y., Xie, W., Huang, W. (2021). A Novel Adaptive PID Optimizer of Deep Neural Networks. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_59
DOI: https://doi.org/10.1007/978-3-030-92307-5_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92306-8
Online ISBN: 978-3-030-92307-5
eBook Packages: Computer Science, Computer Science (R0)