
The Impact of Learning Rate Decay and Periodical Learning Rate Restart on Artificial Neural Network

Published: 30 July 2021

Abstract

The learning rate is widely regarded as one of the most important hyper-parameters in model training. In this paper, two typical adjustment strategies, learning rate decay and periodical learning rate restart, are tested on artificial neural networks (ANNs) and compared with a fixed learning rate. Experiments demonstrate that the learning rate adjustment strategies outperform the fixed learning rate in model training, offering faster convergence, higher validation accuracy and lower training loss. In addition, the periodical learning rate restart strategy tends to require fewer epochs than learning rate decay to reach the same accuracy. Thus, raising the learning rate again at appropriate points during training can help the model fit better and achieve strong performance.
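For concreteness, the two strategies compared in the paper can be viewed as simple epoch-to-learning-rate schedules. The Python sketch below pairs a common step variant of learning rate decay with an SGDR-style cosine schedule for periodical restart; the function names, decay factor, cycle length, and rate bounds are illustrative assumptions, not values from the paper.

```python
import math

def step_decay(epoch, lr0=0.1, drop=0.5, epochs_per_drop=10):
    # Learning rate decay (step variant): multiply the initial rate
    # by `drop` once every `epochs_per_drop` epochs.
    return lr0 * drop ** (epoch // epochs_per_drop)

def cosine_restart(epoch, lr_max=0.1, lr_min=1e-4, cycle=10):
    # Periodical restart (SGDR-style): cosine-anneal from lr_max toward
    # lr_min within a cycle, then jump back to lr_max at the next cycle.
    t = epoch % cycle  # position within the current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / cycle))

# Compare the two schedules against a fixed rate over 30 epochs.
for epoch in range(30):
    print(f"epoch {epoch:2d}: fixed=0.1000  "
          f"decay={step_decay(epoch):.4f}  restart={cosine_restart(epoch):.4f}")
```

Note that the restart schedule is the only one of the three that ever increases the learning rate, which is the behavior the abstract credits with reaching a given accuracy in fewer epochs.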




Published In

AIEE '21: Proceedings of the 2021 2nd International Conference on Artificial Intelligence in Electronics Engineering
January 2021
102 pages
ISBN:9781450389273
DOI:10.1145/3460268

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. ANN
  2. Cyclical learning rate
  3. Learning rate adjustment
  4. Learning rate decay
  5. Stochastic gradient descent with warm restarts

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

AIEE 2021

