Acceleration of Scientific Deep Learning Models on Heterogeneous Computing Platform with Intel® FPGAs

  • Conference paper
High Performance Computing (ISC High Performance 2019)

Abstract

AI and deep learning are experiencing explosive growth in almost every domain involving analysis of big data. Deep learning using Deep Neural Networks (DNNs) has shown great promise for such scientific data analysis applications. However, traditional CPU-based sequential computing can no longer meet the requirements of mission-critical applications, which are compute-intensive and require low latency and high throughput. Heterogeneous computing (HGC), in which CPUs are integrated with accelerators such as GPUs and FPGAs, offers unique capabilities to accelerate DNNs. Collaborating researchers at SHREC\(^{1}\) at the University of Florida, NERSC\(^{2}\) at Lawrence Berkeley National Lab, CERN Openlab, Dell EMC, and Intel are studying the application of HGC to scientific problems using DNN models. This paper focuses on the use of FPGAs to accelerate the inferencing stage of the HGC workflow. We present case studies and results in inferencing state-of-the-art DNN models for scientific data analysis, using the Intel Distribution of OpenVINO toolkit running on an Intel Programmable Acceleration Card (PAC) equipped with an Arria 10 GX FPGA. Using the Intel Deep Learning Acceleration (DLA) development suite to optimize existing FPGA primitives and develop new ones, we were able to accelerate the scientific DNN models under study, achieving speedups of 3\(\times\) to 6\(\times\) for a single Arria 10 FPGA relative to a single core (single thread) of a server-class Skylake CPU.

Acknowledgement

This research is funded in part by the NSF SHREC Center and the National Science Foundation (NSF) through its IUCRC Program under Grant No. CNS-1738420; and by NSF CISE Research Infrastructure (CRI) Program Grant No. 1405790.

Author information

Corresponding author

Correspondence to Chao Jiang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Jiang, C. et al. (2019). Acceleration of Scientific Deep Learning Models on Heterogeneous Computing Platform with Intel® FPGAs. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_44

  • DOI: https://doi.org/10.1007/978-3-030-34356-9_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34355-2

  • Online ISBN: 978-3-030-34356-9

  • eBook Packages: Computer Science, Computer Science (R0)
