Abstract
In reconfigurable systems, fast reconfiguration and compact configuration contexts are essential for improving processing performance and reducing implementation overhead. This paper proposes HCC, a hierarchical representation of contexts for coarse-grained reconfigurable architectures (CGRAs), to meet both requirements. In HCC, the contexts are constructed hierarchically so that their repetitive portions are eliminated, which reduces the overall context storage size and also lowers the context transfer overhead. HCC further includes a fast context-indexing mechanism that achieves high configuration speed, because the hierarchically organized contexts can be located and accessed directly. HCC has been verified in a reconfigurable processor called REMUS_HP. With HCC, when H.264 decoding is implemented on REMUS_HP, the total context volume is reduced by 76.67% compared with a traditional non-hierarchical representation, and the configuration speed is on average 23× higher than that of the latest reported optimized configuration mechanism on Virtex-4 FX60. REMUS_HP is implemented on 48.9 mm² of silicon in TSMC 65 nm technology. Simulation shows that 1920 × 1088@30 fps can be achieved for H.264 high-profile decoding at a working frequency of 200 MHz. Compared with the high-performance version of XPP, performance is boosted by 181%.
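The idea of removing repeated context portions and reaching them through an index can be illustrated with a small sketch. This is not the HCC format itself, which the abstract does not specify: the two-level layout, the class name `ContextStore`, and the one-word-per-reference cost model are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Block = Tuple[int, ...]  # hypothetical low-level block of configuration words


class ContextStore:
    """Toy two-level context store (illustrative assumption, not HCC).

    Each unique block is stored once; a top-level context is a list of
    indices into the block table.
    """

    def __init__(self) -> None:
        self.blocks: List[Block] = []           # unique low-level blocks
        self._block_index: Dict[Block, int] = {}
        self.contexts: List[List[int]] = []     # top level: block indices

    def add_context(self, blocks: List[Block]) -> int:
        """Register a context and return its id; repeated blocks are shared."""
        refs = []
        for block in blocks:
            idx = self._block_index.get(block)
            if idx is None:
                idx = len(self.blocks)
                self.blocks.append(block)
                self._block_index[block] = idx
            refs.append(idx)
        self.contexts.append(refs)
        return len(self.contexts) - 1

    def fetch(self, context_id: int) -> List[Block]:
        """Rebuild the full context by following the stored indices."""
        return [self.blocks[i] for i in self.contexts[context_id]]

    def flat_words(self) -> int:
        """Words needed if every context stored its blocks in full."""
        return sum(len(self.blocks[i]) for refs in self.contexts for i in refs)

    def stored_words(self) -> int:
        """Words actually stored: unique blocks plus one word per reference."""
        return sum(len(b) for b in self.blocks) + sum(len(r) for r in self.contexts)


store = ContextStore()
a = (0x1, 0x2, 0x3, 0x4)
b = (0x5, 0x6, 0x7, 0x8)
c0 = store.add_context([a, b, a])
c1 = store.add_context([b, b, a])
print(store.flat_words(), store.stored_words())
```

In this toy case the two contexts share the same two blocks, so the store holds each block once plus six index entries. The saving depends entirely on how much repetition the contexts contain, and the figures reported in the abstract come from the authors' own representation and workload, not from this sketch.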
Cite this article
Wang, Y., Liu, L., Yin, S. et al. Hierarchical representation of on-chip context to reduce reconfiguration time and implementation area for coarse-grained reconfigurable architecture. Sci. China Inf. Sci. 56, 1–20 (2013). https://doi.org/10.1007/s11432-013-4842-5
DOI: https://doi.org/10.1007/s11432-013-4842-5