Abstract
Configuration overhead is a major performance bottleneck of the partial reconfiguration process. In this paper, we propose a combination of two techniques to minimize the partial reconfiguration performance overhead. First, we design and implement fully streaming DMA engines to achieve near-perfect configuration throughput. Second, we exploit configuration data redundancy through Run-Length Encoding to compress the configuration bitstreams, and we implement an intelligent ICAP (Internal Configuration Access Port) controller to perform decompression at runtime. The results show that our design achieves an effective configuration data transfer throughput well above the 400 Mbytes/s upper bound on data transfer throughput. Specifically, our fully streaming DMA engines reduce the configuration time from the range of seconds to the range of milliseconds, a more than 1000-fold improvement. In addition, our simple compression scheme achieves a significant reduction in bitstream size and results in a decompression circuit with negligible hardware overhead.
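To illustrate the second technique, the following is a minimal sketch of word-level Run-Length Encoding applied to a list of 32-bit configuration words. The (count, word) pair format, the function names, and the sample words are assumptions made for illustration only; the abstract does not specify the encoding used by the actual ICAP controller. The last function shows one possible reading of "effective throughput": uncompressed data delivered per unit of time spent transferring the compressed stream.

```python
from typing import Iterable, List, Tuple

Run = Tuple[int, int]  # (repeat count, 32-bit word); hypothetical format


def rle_encode(words: Iterable[int]) -> List[Run]:
    """Collapse each run of identical words into one (count, word) pair."""
    runs: List[Run] = []
    for w in words:
        if runs and runs[-1][1] == w:
            runs[-1] = (runs[-1][0] + 1, w)  # extend the current run
        else:
            runs.append((1, w))              # start a new run
    return runs


def rle_decode(runs: Iterable[Run]) -> List[int]:
    """Expand (count, word) pairs back into the original word stream."""
    out: List[int] = []
    for count, w in runs:
        out.extend([w] * count)
    return out


def effective_throughput(port_mb_s: float, original_words: int,
                         stored_words: int) -> float:
    """Uncompressed Mbytes/s delivered when the compressed stream is
    transferred at port_mb_s (assumed model, not the paper's measurement)."""
    return port_mb_s * original_words / stored_words


# Illustrative word stream: runs of repeated words compress well.
bitstream = [0x00000000] * 6 + [0xAA995566] + [0x20000000] * 3
runs = rle_encode(bitstream)
stored = 2 * len(runs)  # each run costs two words in this toy format
```

In this toy case, 10 words shrink to 3 runs (6 stored words), so a stream fetched at 400 Mbytes/s would deliver roughly 667 Mbytes/s of uncompressed configuration data under the assumed model. Isolated words cost more than they save in this format, which is why a practical scheme would typically mix literal and run records.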
Cite this article
Liu, S., Pittman, R.N., Forin, A. et al. Minimizing the runtime partial reconfiguration overheads in reconfigurable systems. J Supercomput 61, 894–911 (2012). https://doi.org/10.1007/s11227-011-0657-6
DOI: https://doi.org/10.1007/s11227-011-0657-6