Abstract
A methodology is presented for partitioning applications between reconfigurable hardware blocks of different granularity that are embedded in a generic heterogeneous architecture. The fine-grain reconfigurable logic is realized by an FPGA unit, and the coarse-grain reconfigurable hardware by a two-dimensional array of processing elements. Critical parts of the application, called kernels, are mapped onto the coarse-grain reconfigurable logic to improve performance. The partitioning method consists of three main steps: analysis of the input code, mapping onto the Coarse-Grain Reconfigurable Array, and mapping onto the FPGA. The partitioning flow is implemented in a prototype software framework. Analytical partitioning experiments on five real-world applications show that the execution-time speedup relative to an all-FPGA solution ranges from 1.4 to 5.0.
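The idea of moving kernels to the coarse-grain array and measuring speedup against an all-FPGA mapping can be illustrated with a minimal sketch. This is not the authors' framework: the `Block` type, the cycle counts, the greedy selection rule, and the `max_kernels` limit are all hypothetical, and the model ignores any communication or reconfiguration overhead between the two fabrics.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    fpga_cycles: int  # hypothetical estimated time on the fine-grain FPGA
    cgra_cycles: int  # hypothetical estimated time on the coarse-grain array


def select_kernels(blocks, max_kernels):
    """Greedily pick the blocks that save the most time when moved to the array."""
    ranked = sorted(blocks, key=lambda b: b.fpga_cycles - b.cgra_cycles, reverse=True)
    # Only blocks that actually run faster on the array count as kernels.
    return {b.name for b in ranked[:max_kernels] if b.cgra_cycles < b.fpga_cycles}


def speedup_vs_all_fpga(blocks, kernels):
    """Ratio of all-FPGA execution time to partitioned execution time."""
    all_fpga = sum(b.fpga_cycles for b in blocks)
    partitioned = sum(
        b.cgra_cycles if b.name in kernels else b.fpga_cycles for b in blocks
    )
    return all_fpga / partitioned


# Illustrative numbers only; they are not taken from the paper's experiments.
blocks = [
    Block("init", 100, 120),
    Block("loop_a", 600, 150),
    Block("loop_b", 300, 100),
]
kernels = select_kernels(blocks, max_kernels=2)
result = speedup_vs_all_fpga(blocks, kernels)
print(sorted(kernels), round(result, 2))
```

In this toy model the non-kernel block stays on the FPGA, so the achievable speedup is bounded by the share of execution time spent in the kernels.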
Cite this article
Galanis, M.D., Dimitroulakos, G. & Goutis, C.E. Partitioning Methodology for Heterogeneous Reconfigurable Functional Units. J Supercomput 38, 17–34 (2006). https://doi.org/10.1007/s11227-006-6743-5
DOI: https://doi.org/10.1007/s11227-006-6743-5