Abstract
Moore’s law is running into a bottleneck, and the computing power of general-purpose processors is constrained as a result. At the same time, new types of enterprise computing, such as big data management and analysis, pose greater challenges to the computational performance and scalability of the data center. Considerable research effort has been devoted to accelerating algorithms on Field Programmable Gate Arrays (FPGAs) because of their high performance and reprogrammability. In this paper, we first study the OpenCL-based FPGA heterogeneous platform and propose a novel parallel acceleration framework that combines multiple computing units with internal hardware pipeline (flow) parallelism. We then implement a high performance computing application (the AES algorithm) through the proposed framework and evaluate how the number of computing units affects performance and resource utilization. We also compare its performance with that of a CPU implementation. The results show that the proposed framework offers high performance and scalability for implementing a class of algorithms suitable for parallelization, and that it meets the demands of data center and high performance computing (HPC) applications.
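As a rough illustration of the multi-computing-unit idea, the following host-side sketch splits a buffer of 16-byte blocks into contiguous ranges and hands each range to one worker, which stands in for one computing unit. It is a minimal sketch under stated assumptions, not the framework evaluated in the paper: the names `toy_block_transform`, `split_ranges`, `process_range`, and `run_on_units` are hypothetical, the per-block transform is a simple XOR placeholder rather than AES, and Python threads only model the partitioning logic, not FPGA hardware or its internal pipeline.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_BYTES = 16  # AES operates on 128-bit (16-byte) blocks


def toy_block_transform(block: bytes, key: bytes) -> bytes:
    """Hypothetical stand-in for a per-block cipher pipeline (NOT real AES)."""
    return bytes(b ^ k for b, k in zip(block, key))


def split_ranges(num_blocks: int, num_units: int) -> list[tuple[int, int]]:
    """Divide num_blocks into num_units contiguous, near-equal [start, end) ranges."""
    base, extra = divmod(num_blocks, num_units)
    ranges = []
    start = 0
    for unit in range(num_units):
        # The first `extra` units each take one additional block.
        end = start + base + (1 if unit < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def process_range(data: bytes, key: bytes, start: int, end: int) -> bytes:
    """Work done by one modeled computing unit: transform blocks start..end-1."""
    out = bytearray()
    for i in range(start, end):
        block = data[i * BLOCK_BYTES:(i + 1) * BLOCK_BYTES]
        out += toy_block_transform(block, key)
    return bytes(out)


def run_on_units(data: bytes, key: bytes, num_units: int) -> bytes:
    """Process independent blocks on num_units workers and reassemble in order."""
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    if len(data) % BLOCK_BYTES != 0 or len(key) != BLOCK_BYTES:
        raise ValueError("data must be whole blocks and key must be one block long")
    ranges = split_ranges(len(data) // BLOCK_BYTES, num_units)
    with ThreadPoolExecutor(max_workers=num_units) as pool:
        # map() yields results in submission order, so output order is preserved.
        parts = list(pool.map(lambda r: process_range(data, key, r[0], r[1]), ranges))
    return b"".join(parts)
```

Because each block is transformed independently in this sketch, the output does not depend on the number of workers; this independence between blocks is the property that makes such a workload a candidate for replication across computing units.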
Acknowledgement
The research was jointly supported by project grants from the Shenzhen Science & Technology Foundation: JCYJ20150930105133185/JCYJ20170302153920897, the National Natural Science Foundation of China: NSF/GDU1301252, and the higher education reform project of the Guangdong Provincial Department of Education: “Research on teaching reform of computer hardware series lessons based on system view”, 20150819.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Zhang, Y., Cai, Y., Luo, Q. (2018). Research on Parallel Architecture of OpenCL-Based FPGA. In: Qiu, M. (eds) Smart Computing and Communication. SmartCom 2017. Lecture Notes in Computer Science, vol 10699. Springer, Cham. https://doi.org/10.1007/978-3-319-73830-7_4
DOI: https://doi.org/10.1007/978-3-319-73830-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73829-1
Online ISBN: 978-3-319-73830-7
eBook Packages: Computer Science, Computer Science (R0)