Parallelizing Convolutional Neural Networks on Intel® Many Integrated Core Architecture

Liu, Junjie; Wang, Haixia; Wang, Dongsheng; Gao, Yuan; Li, Zuofeng

doi:10.1007/978-3-319-16086-3_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9017)

Included in the following conference series:

International Conference on Architecture of Computing Systems

Abstract

Convolutional neural networks (CNNs) are state-of-the-art machine learning algorithms for low-resolution vision tasks and are widely applied. However, training them is very time-consuming. Many approaches have therefore been proposed to speed up training, of which parallelization is one of the most effective. In this article, we parallelize a classic CNN with OpenMP on a new platform, the Intel® Xeon Phi™ coprocessor. Our implementation achieves a 131× speedup over the serial version running on the coprocessor itself and an 8.3× speedup over the serial baseline on the Xeon® E5-2697 CPU.
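
The two speedup figures can be related to each other. As a rough check, assuming both are measured against the same parallel run on the coprocessor and on the same workload (the abstract does not state this explicitly, and the symbols $T$ below are introduced here only for illustration), the implied ratio between the two serial baselines is:

$$\frac{T_{\text{Phi,serial}}}{T_{\text{Xeon,serial}}} = \frac{T_{\text{Phi,serial}} / T_{\text{Phi,parallel}}}{T_{\text{Xeon,serial}} / T_{\text{Phi,parallel}}} = \frac{131}{8.3} \approx 15.8$$

Under that assumption, a single-threaded run on the Xeon E5-2697 would be roughly 16 times faster than a single-threaded run on the coprocessor, which would explain why the speedup over the CPU baseline is so much smaller than the speedup over the coprocessor's own serial version.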

Author information

Authors and Affiliations

Tsinghua National Laboratory for Information Science and Technology, Beijing, 100084, China
Junjie Liu, Haixia Wang, Dongsheng Wang, Yuan Gao & Zuofeng Li

Corresponding author

Correspondence to Junjie Liu.

Editor information

Editors and Affiliations

CISTER/INESC TEC, ISEP Research Center, Porto, Portugal
Luís Miguel Pinho
Karlsruher Institut für Technologie, Karlsruhe, Germany
Wolfgang Karl
Inria and École Normale Supérieure, Paris, France
Albert Cohen
Goethe University Fachbereich Informatik und Mathematik, Frankfurt am Main, Germany
Uwe Brinkschulte

About this paper

Cite this paper

Liu, J., Wang, H., Wang, D., Gao, Y., Li, Z. (2015). Parallelizing Convolutional Neural Networks on Intel® Many Integrated Core Architecture. In: Pinho, L., Karl, W., Cohen, A., Brinkschulte, U. (eds) Architecture of Computing Systems – ARCS 2015. ARCS 2015. Lecture Notes in Computer Science, vol 9017. Springer, Cham. https://doi.org/10.1007/978-3-319-16086-3_6

DOI: https://doi.org/10.1007/978-3-319-16086-3_6
Published: 11 March 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16085-6
Online ISBN: 978-3-319-16086-3
eBook Packages: Computer Science, Computer Science (R0)
