Distributed B-SDLM: Accelerating the Training Convergence of Deep Neural Networks Through Parallelism

  • Conference paper
PRICAI 2016: Trends in Artificial Intelligence (PRICAI 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9810)


Abstract

This paper proposes an efficient asynchronous stochastic second-order learning algorithm for distributed learning of neural networks (NNs). The proposed algorithm, named distributed bounded stochastic diagonal Levenberg-Marquardt (distributed B-SDLM), is based on the B-SDLM algorithm, which converges fast while incurring only minimal computational overhead compared with the stochastic gradient descent (SGD) method. The proposed algorithm is implemented using the parameter server thread model in the MPICH implementation of MPI. Experiments on the MNIST dataset show that training with distributed B-SDLM on a 16-core CPU cluster allows the convolutional neural network (CNN) model to reach convergence much faster, with speedups of \(6.03{\times }\) and \(12.28{\times }\) in reaching training and testing loss values of 0.01 and 0.08, respectively. This also translates into significantly less time to reach a given classification accuracy (\(5.67{\times }\) and \(8.72{\times }\) faster to reach \(99\,\%\) training and \(98\,\%\) testing accuracies on the MNIST dataset, respectively).
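The abstract describes the method only at a high level. As a rough illustration of the two ingredients it names, an asynchronous parameter server and a bounded second-order (stochastic diagonal Levenberg-Marquardt) update, the sketch below uses Python with mpi4py and NumPy. This is not the authors' implementation (theirs uses the parameter server thread model in MPICH): the constants, the toy_batch_stats stand-in for backpropagation, and the clipping used as the "bound" are all illustrative assumptions; the exact B-SDLM update rule is given in the original B-SDLM paper.

# Hypothetical sketch of asynchronous training with a parameter server and
# a bounded SDLM-style second-order update. Rank 0 is the server; all other
# ranks are workers. Every constant and helper here is an assumption.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

DIM = 1000      # toy parameter count (assumption)
STEPS = 500     # mini-batch updates per worker (assumption)
EPS = 0.02      # global learning rate (guessed value)
MU = 0.1        # damping added to the curvature estimate
GAMMA = 0.99    # running-average factor for the diagonal Hessian estimate

def toy_batch_stats(w):
    """Stand-in for backprop on one mini-batch: returns the gradient and a
    non-negative Gauss-Newton estimate of the diagonal of the Hessian."""
    grad = w + 0.1 * np.random.randn(*w.shape)  # gradient of a dummy quadratic loss
    h_diag = np.ones_like(w)                    # constant dummy curvature
    return grad, h_diag

if rank == 0:
    # Parameter server: applies updates in whatever order they arrive;
    # this asynchrony is what removes blocking on slow workers.
    w = np.zeros(DIM)
    h = np.ones(DIM)
    n_workers = comm.Get_size() - 1
    status = MPI.Status()
    for _ in range(STEPS * n_workers):
        grad, h_batch = comm.recv(source=MPI.ANY_SOURCE, tag=1, status=status)
        h = GAMMA * h + (1.0 - GAMMA) * h_batch    # smooth the curvature estimate
        # Per-parameter learning rates EPS / (h + MU), clipped to a fixed
        # range; the clipping stands in for B-SDLM's actual bounding rule.
        eta = np.clip(EPS / (h + MU), 1e-5, 0.5)
        w -= eta * grad
        comm.send(w, dest=status.Get_source(), tag=2)  # return fresh weights
else:
    # Worker: starts from the same initialization as the server, then
    # alternates push-local-statistics / receive-updated-weights.
    w = np.zeros(DIM)
    for _ in range(STEPS):
        grad, h_batch = toy_batch_stats(w)
        comm.send((grad, h_batch), dest=0, tag=1)
        w = comm.recv(source=0, tag=2)  # may already be stale by next batch

Launched with, for example, mpiexec -n 4 python sketch.py (the file name is arbitrary), rank 0 serves parameters and three workers train. Because the server applies whichever update arrives first, workers sometimes compute gradients on slightly stale weights; tolerating that staleness instead of synchronizing on every step is the trade-off behind the reported speedups.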

Acknowledgements

This work is supported by Universiti Teknologi Malaysia (UTM) and the Ministry of Science, Technology and Innovation of Malaysia (MOSTI) under the Science Fund Grant No. 4S116.

Author information

Corresponding author

Correspondence to Shan Sung Liew.



Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Liew, S.S., Khalil-Hani, M., Bakhteri, R. (2016). Distributed B-SDLM: Accelerating the Training Convergence of Deep Neural Networks Through Parallelism. In: Booth, R., Zhang, ML. (eds) PRICAI 2016: Trends in Artificial Intelligence. PRICAI 2016. Lecture Notes in Computer Science (LNAI), vol. 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_20

  • DOI: https://doi.org/10.1007/978-3-319-42911-3_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42910-6

  • Online ISBN: 978-3-319-42911-3

  • eBook Packages: Computer Science, Computer Science (R0)
