Abstract
Executing complex scientific applications on Coarse Grain Reconfigurable Arrays (CGRAs) promises reduced execution time and/or energy consumption compared to software execution or even customized hardware solutions. The compute core of a CGRA architecture is a cell that typically consists of simple and generic hardware units, such as ALUs, simple processors, or even custom logic tailored to an application’s specific characteristics. However, generality in the cell contents, while convenient for serving multiple applications, comes at the cost of reduced acceleration and higher energy consumption.
This work proposes a novel Mixed-CGRA Definition Framework (MC-DeF) targeting a Mixed-CGRA architecture that combines the advantages of CGRAs, through a customized cell array, with those of FPGAs, through a separate LUT array that provides adaptability. Our framework employs a custom cell structure and functionality definition phase to create highly customized application- or domain-specific CGRA designs. This is achieved through cost functions that combine metrics such as resource usage, connectivity overhead, and occupied chip area, among others, with user-defined threshold values. Thus, the framework helps the user create suitable designs based on the application’s needs and/or design restrictions, such as energy and/or area constraints.
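As a rough illustration of how a threshold-based cost function of this kind might work, the following Python sketch scores candidate cell definitions with a weighted sum of metrics and compares the score against a user-defined threshold. It is not MC-DeF's actual cost function: the weighted-sum form, the normalized metric ranges, and the names `CandidateCell`, `cost`, and `select_cells` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateCell:
    """Hypothetical candidate cell definition with illustrative metrics.

    All three metrics are assumed to be normalized to the range 0..1.
    """
    name: str
    resource_usage: float
    connectivity_overhead: float
    chip_area: float


def cost(cell, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the metrics (an assumed form, not the paper's)."""
    w_res, w_conn, w_area = weights
    return (w_res * cell.resource_usage
            + w_conn * cell.connectivity_overhead
            + w_area * cell.chip_area)


def select_cells(candidates, threshold, weights=(1.0, 1.0, 1.0)):
    """Split candidates into those at or below the threshold and the rest."""
    accepted, rejected = [], []
    for cell in candidates:
        target = accepted if cost(cell, weights) <= threshold else rejected
        target.append(cell.name)
    return accepted, rejected


if __name__ == "__main__":
    # Made-up metric values, used only to exercise the functions.
    candidates = [
        CandidateCell("cell_a", 0.25, 0.5, 0.125),
        CandidateCell("cell_b", 0.5, 0.5, 0.5),
    ]
    print(select_cells(candidates, threshold=1.0))
```

Changing the weights or the threshold shifts which candidates are accepted, which is one way a user could express energy or area constraints in such a scheme.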
We evaluate our framework using three applications, Hayashi-Yoshida, Mutual Information, and Transfer Entropy, and present fully functional, FPGA-based implementations of these applications to demonstrate the validity of our framework. Comparisons with related work show that MC-DeF performs favourably in terms of processing throughput, even when compared with much larger designs, and uses fewer resources than most of the compared architectures, while making better use of the underlying architecture and recording the second-best efficiency (LUT/GOPs) rating.
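The efficiency rating mentioned above can be read as LUTs consumed per GOPs of throughput, where a lower value would indicate better use of the fabric. That reading is an assumption, and the sketch below only shows how such a ratio could be computed and used to rank designs. The function names are hypothetical, and any design names and figures passed to them are placeholders, not results from the paper.

```python
def luts_per_gops(luts, gops):
    """Return LUTs used per GOPs of throughput (assumed: lower is better)."""
    if gops <= 0:
        raise ValueError("throughput must be positive")
    return luts / gops


def rank_by_efficiency(designs):
    """Sort design names from best (lowest LUT/GOPs) to worst.

    `designs` maps a design name to a (luts, gops) tuple.
    """
    return sorted(designs, key=lambda name: luts_per_gops(*designs[name]))


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    designs = {
        "design_a": (1000, 10.0),
        "design_b": (4000, 80.0),
        "design_c": (900, 3.0),
    }
    for name in rank_by_efficiency(designs):
        print(name, luts_per_gops(*designs[name]))
```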
This research is supported in part by the General Secretariat for Research and Technology (GSRT) and the Hellenic Foundation for Research and Innovation (HFRI).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Charitopoulos, G., Pnevmatikatos, D.N. (2020). A CGRA Definition Framework for Dataflow Applications. In: Rincón, F., Barba, J., So, H., Diniz, P., Caba, J. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2020. Lecture Notes in Computer Science, vol 12083. Springer, Cham. https://doi.org/10.1007/978-3-030-44534-8_21
DOI: https://doi.org/10.1007/978-3-030-44534-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44533-1
Online ISBN: 978-3-030-44534-8
eBook Packages: Computer Science, Computer Science (R0)