Research on Parallel Architecture of OpenCL-Based FPGA

Zhang, Yi; Cai, Ye; Luo, Qiuming

doi:10.1007/978-3-319-73830-7_4


Conference paper
First Online: 18 January 2018


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10699)

Abstract

Moore’s law has hit a bottleneck, and the computing power of general-purpose processors is limited. At the same time, new types of enterprise computing, such as big data management and analysis, place growing demands on the computational performance and scalability of the data center. Research efforts have been devoted to accelerating algorithms on Field Programmable Gate Arrays (FPGAs) because of their high performance and reprogrammability. In this paper, we first study the heterogeneous platform of OpenCL-based FPGAs and propose a novel parallel acceleration framework that combines multiple computing units with internal hardware pipelining. We then evaluate how the number of computing units affects performance and resource utilization, using a high-performance computing application (the AES algorithm) implemented with the proposed framework, and compare its performance with a CPU implementation. The results show that the proposed framework offers high performance and scalability for algorithms that are suitable for parallelization, and that it meets the demands of data center and high performance computing (HPC) applications.
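As a rough illustration of why combining several computing units with an internal pipeline can scale, the following Python sketch models the idealized cycle count when data blocks are split evenly across units and each unit is a fully pipelined datapath that accepts one block per cycle. This is not the authors' framework or their measurements: the function names, the block count, and the pipeline depth of 10 are hypothetical, and the model ignores memory bandwidth, host transfer time, and FPGA resource limits.

```python
import math


def pipeline_cycles(num_blocks: int, depth: int) -> int:
    """Cycles for one fully pipelined unit (one new block per cycle)
    to finish num_blocks: fill the pipeline once, then one result per cycle."""
    if num_blocks == 0:
        return 0
    return depth + num_blocks - 1


def total_cycles(num_blocks: int, num_units: int, depth: int) -> int:
    """Blocks are split evenly across units that run concurrently,
    so the unit with the most blocks determines the total."""
    per_unit = math.ceil(num_blocks / num_units)
    return pipeline_cycles(per_unit, depth)


if __name__ == "__main__":
    blocks, depth = 1_000_000, 10  # hypothetical workload and pipeline depth
    base = total_cycles(blocks, 1, depth)
    for units in (1, 2, 4, 8):
        cycles = total_cycles(blocks, units, depth)
        print(f"{units} unit(s): {cycles} cycles, speedup {base / cycles:.2f}x")
```

In this simplified model the speedup approaches the number of units once the block count is much larger than the pipeline depth. On real hardware the gain would be bounded by available logic resources and memory bandwidth, which is the trade-off the paper evaluates.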



Acknowledgement

The research was jointly supported by project grants from the Shenzhen Science & Technology Foundation (JCYJ20150930105133185/JCYJ20170302153920897), the National Natural Science Foundation of China (NSF/GDU1301252), and the higher education reform project of the Guangdong Provincial Department of Education, “Research on teaching reform of computer hardware series lessons based on system view” (20150819).

Author information

Authors and Affiliations

College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Yi Zhang, Ye Cai & Qiuming Luo


Corresponding author

Correspondence to Yi Zhang.

Editor information

Editors and Affiliations

Columbia University, New York, New York, USA
Meikang Qiu


About this paper

Cite this paper

Zhang, Y., Cai, Y., Luo, Q. (2018). Research on Parallel Architecture of OpenCL-Based FPGA. In: Qiu, M. (eds) Smart Computing and Communication. SmartCom 2017. Lecture Notes in Computer Science, vol 10699. Springer, Cham. https://doi.org/10.1007/978-3-319-73830-7_4


DOI: https://doi.org/10.1007/978-3-319-73830-7_4
Published: 18 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73829-1
Online ISBN: 978-3-319-73830-7
eBook Packages: Computer Science, Computer Science (R0)
