Abstract
This paper proposes an efficient asynchronous stochastic second-order learning algorithm for distributed training of neural networks (NNs). The proposed algorithm, named distributed bounded stochastic diagonal Levenberg-Marquardt (distributed B-SDLM), is based on the B-SDLM algorithm, which converges quickly and incurs only minimal computational overhead compared with the stochastic gradient descent (SGD) method. The algorithm is implemented on the parameter server thread model in the MPICH implementation. Experiments on the MNIST dataset show that training with the distributed B-SDLM on a 16-core CPU cluster allows the convolutional neural network (CNN) model to converge much faster, with speedups of \(6.03{\times }\) and \(12.28{\times }\) to reach training and testing loss values of 0.01 and 0.08, respectively. This also yields significantly shorter times to reach a given classification accuracy (\(5.67{\times }\) and \(8.72{\times }\) faster to reach \(99\,\%\) training and \(98\,\%\) testing accuracies on the MNIST dataset, respectively).
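To illustrate the kind of update rule the abstract refers to, the sketch below shows a generic stochastic diagonal Levenberg-Marquardt step, in which each parameter gets an individual learning rate scaled by an estimate of the diagonal of the Hessian. This is an illustrative sketch only: the function name, the constants `eta` and `mu`, and the bounding interval are assumptions and are not taken from the paper's actual formulation.

```python
import numpy as np

def sdlm_update(theta, grad, hessian_diag, eta=0.01, mu=0.1,
                h_min=0.0, h_max=10.0):
    """One stochastic diagonal Levenberg-Marquardt step (illustrative).

    Each parameter is updated with an individual step size
    eta / (h + mu), where h is a (bounded) estimate of the
    corresponding diagonal Hessian entry and mu regularizes
    near-zero curvature. Bounding h is a stand-in for the
    "bounded" aspect of B-SDLM; the actual bounds in the
    paper may differ.
    """
    h = np.clip(hessian_diag, h_min, h_max)   # bound curvature estimates
    step = eta / (h + mu)                     # per-parameter learning rate
    return theta - step * grad

# Toy usage: minimize f(x) = x^2 (gradient 2x, diagonal Hessian 2).
theta = np.array([1.0])
for _ in range(50):
    grad = 2.0 * theta
    theta = sdlm_update(theta, grad, hessian_diag=np.array([2.0]))
print(theta)  # shrinks toward 0
```

In the distributed setting described in the abstract, worker threads would compute `grad` (and the curvature estimate) on their own mini-batches and send them asynchronously to a parameter server that applies this update to the shared `theta`.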
Acknowledgements
This work is supported by Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation of Malaysia (MOSTI) under the Science Fund Grant No. 4S116.
© 2016 Springer International Publishing Switzerland
Cite this paper
Liew, S.S., Khalil-Hani, M., Bakhteri, R. (2016). Distributed B-SDLM: Accelerating the Training Convergence of Deep Neural Networks Through Parallelism. In: Booth, R., Zhang, ML. (eds) PRICAI 2016: Trends in Artificial Intelligence. PRICAI 2016. Lecture Notes in Computer Science(), vol 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42910-6
Online ISBN: 978-3-319-42911-3