Abstract
AI and deep learning are experiencing explosive growth in almost every domain involving analysis of big data. Deep learning using Deep Neural Networks (DNNs) has shown great promise for scientific data analysis applications. However, traditional CPU-based sequential computing can no longer meet the requirements of mission-critical applications, which are compute-intensive and require low latency and high throughput. Heterogeneous computing (HGC), in which CPUs are integrated with accelerators such as GPUs and FPGAs, offers unique capabilities to accelerate DNNs. Collaborating researchers at SHREC (the NSF Center for Space, High-performance, and Resilient Computing) at the University of Florida, NERSC (the National Energy Research Scientific Computing Center) at Lawrence Berkeley National Lab, CERN Openlab, Dell EMC, and Intel are studying the application of HGC to scientific problems using DNN models. This paper focuses on the use of FPGAs to accelerate the inferencing stage of the HGC workflow. We present case studies and results from inferencing state-of-the-art DNN models for scientific data analysis, using the Intel Distribution of OpenVINO running on an Intel Programmable Acceleration Card (PAC) equipped with an Arria 10 GX FPGA. Using the Intel Deep Learning Acceleration (DLA) development suite to optimize existing FPGA primitives and develop new ones, we were able to accelerate the scientific DNN models under study by 3× to 6× on a single Arria 10 FPGA relative to a single core (single thread) of a server-class Skylake CPU.
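To make the inference path concrete, the sketch below shows how a model converted offline by the OpenVINO Model Optimizer can be dispatched to the PAC through the Inference Engine's heterogeneous plugin. This is a minimal illustration assuming the 2019-era OpenVINO Python API (IECore/IENetwork); the model file names, input shapes, and synthetic input data are placeholders, not artifacts from the paper.

    # Minimal sketch (not the paper's code): running a DNN converted by the
    # OpenVINO Model Optimizer on the Intel PAC via the Inference Engine.
    # "model.xml"/"model.bin" are placeholder Model Optimizer outputs, e.g.
    # produced offline by: mo_tf.py --input_model frozen_model.pb --data_type FP16
    import numpy as np
    from openvino.inference_engine import IECore, IENetwork

    ie = IECore()
    net = IENetwork(model="model.xml", weights="model.bin")
    input_blob = next(iter(net.inputs))
    output_blob = next(iter(net.outputs))

    # HETERO:FPGA,CPU schedules layers supported by the FPGA (DLA) plugin onto
    # the Arria 10 and falls back to the CPU for any unsupported primitives.
    exec_net = ie.load_network(network=net, device_name="HETERO:FPGA,CPU")

    # Synthetic input with the network's expected shape, for illustration only.
    batch = np.random.rand(*net.inputs[input_blob].shape).astype(np.float32)
    result = exec_net.infer(inputs={input_blob: batch})
    print(output_blob, result[output_blob].shape)

The HETERO device string keeps primitives that the DLA overlay cannot execute on the CPU, which matches the CPU-fallback behavior of the OpenVINO FPGA plugin described in this workflow.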
Acknowledgement
This research is funded in part by the NSF SHREC Center and the National Science Foundation (NSF) through its IUCRC Program under Grant No. CNS-1738420; and by NSF CISE Research Infrastructure (CRI) Program Grant No. 1405790.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Jiang, C., et al. (2019). Acceleration of Scientific Deep Learning Models on Heterogeneous Computing Platform with Intel® FPGAs. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds.) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol. 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_44