Abstract
With the development of artificial intelligence, optimizing the performance of deep neural network models has become an active research topic. The learning rate is one of the most important hyper-parameters for model optimization. In recent years, several learning rate algorithms with a cycle mechanism have been proposed. Most of them adopt warm restarts and a cyclic mechanism so that the learning rate oscillates between two boundary values, and they demonstrate their effectiveness on image classification tasks. To further improve the performance of neural network models and to verify effectiveness on different training tasks, this paper proposes a novel learning rate schedule called hyperbolic tangent polynomial parity cyclic learning rate (HTPPC), which adopts a cycle mechanism and combines the advantages of warm restarts and polynomial decay. The performance of HTPPC is demonstrated on image classification and object detection tasks.
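The exact HTPPC formula is given in the full text; as a rough illustration only, the following minimal sketch shows how a cyclic learning rate schedule can combine a hyperbolic-tangent-shaped decay, a polynomial decay term, and warm restarts. The function name and parameters (cycle_length, lr_max, lr_min, power) are hypothetical and are not taken from the paper.

```python
import math

def cyclic_tanh_poly_lr(step, cycle_length=1000, lr_max=0.1, lr_min=1e-4, power=2.0):
    """Illustrative cyclic schedule (NOT the paper's exact HTPPC formula).

    Within each cycle the rate decays from roughly lr_max to lr_min following a
    tanh-shaped curve modulated by a polynomial term; a warm restart brings the
    rate back to lr_max at every cycle boundary.
    """
    # Position inside the current cycle, in [0, 1); the warm restart happens
    # implicitly when this position wraps back to 0.
    t = (step % cycle_length) / cycle_length
    # tanh-shaped decay factor: close to 1 at the start of a cycle, close to 0 at the end.
    tanh_factor = 0.5 * (1.0 - math.tanh(6.0 * (t - 0.5)))
    # Polynomial decay factor, as in standard polynomial learning rate policies.
    poly_factor = (1.0 - t) ** power
    # Blend the two factors and map the result into [lr_min, lr_max].
    return lr_min + (lr_max - lr_min) * tanh_factor * poly_factor
```

Such a function could be wrapped in a framework's lambda-based scheduler; the shape constants above are illustrative and would be tuned per task.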
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, H., Yang, X., Wu, B., Xiong, R. (2021). Hyperbolic Tangent Polynomial Parity Cyclic Learning Rate for Deep Neural Network. In: Pham, D.N., Theeramunkong, T., Governatori, G., Liu, F. (eds) PRICAI 2021: Trends in Artificial Intelligence. PRICAI 2021. Lecture Notes in Computer Science, vol. 13032. Springer, Cham. https://doi.org/10.1007/978-3-030-89363-7_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89362-0
Online ISBN: 978-3-030-89363-7
eBook Packages: Computer Science (R0)