How to efficiently implement dynamic circuit specialization systems

Published: 29 July 2013

Abstract

Dynamic circuit specialization (DCS) is a technique for implementing FPGA applications in which some of the input data, called parameters, change slowly compared to the other inputs. Each time the parameter values change, the FPGA is reconfigured with a configuration that is specialized for the new parameter values. This specialized configuration is much smaller and faster than a regular configuration. However, the overhead of the specialization process should be minimized to achieve the desired benefits of DCS. This overhead consists of both the FPGA resources needed to specialize the FPGA at runtime and the specialization time. The introduction of parameterized configurations [Bruneel and Stroobandt 2008] has improved the efficiency of DCS implementations, but the specialization overhead still takes a considerable amount of resources and time.
In this article, we explore how to build DCS systems efficiently by presenting a variety of possible solutions for the specialization process, together with the overhead associated with each of them. We split the specialization process into two main phases: the evaluation phase and the configuration phase. For the evaluation phase, we consider the PowerPC embedded processor, the MicroBlaze, and a customized processor (CP) as alternatives. For the configuration phase, we consider the ICAP and a custom configuration interface (SRL configuration). Each solution is used to implement a DCS system for three applications: an adaptive finite impulse response (FIR) filter, a ternary content-addressable memory (TCAM), and a regular expression matcher (RegEx). The experiments show that using our CP together with the SRL configuration achieves the lowest overhead in terms of resources and time. Our CP is 1.8 and 3.5 times smaller than the PowerPC and the area-optimized implementation of the MicroBlaze, respectively. Moreover, the CP enables a more compact representation of the parameterized configuration than either the PowerPC or the MicroBlaze. For instance, for the FIR filter, the parameterized configuration compiled for our CP is 6 to 7 times smaller than that compiled for the embedded processors.
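The two-phase split described above can be illustrated with a small software model. This is only a minimal sketch under stated assumptions, not the implementation evaluated in the article: it assumes that a parameterized configuration can be modelled as a map from configuration bit addresses to Boolean functions of the parameter bits, and it uses a plain Python list as a stand-in for the configuration memory. The names `evaluate`, `configure`, `specialize`, and `XOR_WITH_PARAM` are hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Assumption for illustration: each tunable configuration bit is a Boolean
# function of the parameter bits, keyed by a hypothetical bit address.
BitFunction = Callable[[Sequence[int]], int]
ParameterizedConfig = Dict[int, BitFunction]


def evaluate(pconf: ParameterizedConfig, params: Sequence[int]) -> Dict[int, int]:
    """Evaluation phase: compute a concrete 0/1 value for every tunable bit."""
    return {addr: fn(params) & 1 for addr, fn in pconf.items()}


def configure(memory: List[int], bits: Dict[int, int]) -> None:
    """Configuration phase: write the evaluated bits into configuration memory.

    A list stands in for the real configuration interface here.
    """
    for addr, value in bits.items():
        memory[addr] = value


def specialize(memory: List[int], pconf: ParameterizedConfig,
               params: Sequence[int]) -> None:
    """Run both phases for a new set of parameter values."""
    configure(memory, evaluate(pconf, params))


# Toy example: a 1-input LUT computing out = a XOR p0, where p0 is a parameter.
# The LUT truth table has two entries, indexed by the input a.
XOR_WITH_PARAM: ParameterizedConfig = {
    0: lambda p: p[0],      # a = 0 -> out = p0
    1: lambda p: 1 - p[0],  # a = 1 -> out = NOT p0
}

lut = [0, 0]
specialize(lut, XOR_WITH_PARAM, [1])
print(lut)  # [1, 0]
```

In this model the cost of `evaluate` corresponds to the evaluation phase (run on a processor in a real system) and the cost of `configure` to the configuration phase (performed through a configuration interface), which are the two sources of specialization time the abstract distinguishes.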

References

[1]
Abouelella, F., Bruneel, K., and Stroobandt, D. 2010a. Efficiently generating FPGA configurations through a stack machine. In Proceedings of the International Conference on Field Programmable Logic and Applications.
[2]
Abouelella, F., Bruneel, K., and Stroobandt, D. 2010b. Towards a more efficient run-time FPGA configuration generation. In Parallel Computing: From Multicores and GPU's to Petascale. IOS Press, Amsterdam, 624--631.
[3]
Al Farisi, B., Bruneel, K., and Stroobandt, D. 2010. Automatic tool flow for shift-register-LUT reconfiguration: Making run-time reconfiguration fast and easy. In Proceedings of the ACM/SIGDA 18th International Symposium on Field Programmable Gate Arrays, P. Cheung and J. Wawrzynek, Eds. ACM, 287--287.
[4]
Altera. 2001. Application note 119: Implementing high-speed search applications with Altera CAM. Altera, San Jose, CA.
[5]
Altera. 2008. FPGA run-time reconfiguration: Two approaches. Altera, San Jose, CA.
[6]
Altera. 2009. Interfacing an external processor to an Altera FPGA. Altera, San Jose, CA.
[7]
Anderson, I. and Khalid, M. 2006. Soft-core processors for embedded systems. In Proceedings of the 18th International Conference on Microelectronics (ICM).
[8]
Baker, Z. K., Jung, H.-J., and Prasanna, V. K. 2006. Regular expression software deceleration for intrusion detection systems. In Proceedings of the 16th International Conference on Field Programmable Logic and Applications (FPD'06).
[9]
Bernhart, F. and Kainen, P. C. 1979. The book thickness of a graph. J. Combin. Theory 27, 3, 320--331.
[10]
Bobda, C. 2007. Introduction to Reconfigurable Computing: Architectures, Algorithms, and Applications. Springer, Berlin Heidelberg.
[11]
Brodie, B., Taylor, D., and Cytron, R. 2006. A scalable architecture for high-throughput regular-expression pattern matching. ACM SIGARCH Comput. Architect. News 34, 2.
[12]
Bruneel, K., Heirman, W., and Stroobandt, D. 2011. Dynamic data folding with parameterizable configurations. ACM Trans. Des. Autom. Electron. Syst. 16, 4.
[13]
Bruneel, K. and Stroobandt, D. 2008. Automatic generation of run-time parameterizable configurations. In Proceedings of the International Conference on Field Programmable Logic and Applications, U. Kebschull, M. Platzner, and T. J., Eds. Kirchhoff Institute for Physics, Heidelberg, 361--366.
[14]
Chung, F. R. K., Leighton, F. T., and Rosenberg, A. L. 1987. Embedding graphs in books: A layout problem with applications to VLSI design. SIAM J. Algebra. Discrete Methods 8, 1, 33--58.
[15]
Ditech Networks. 2011. Echo basics tutorial. Ditech Networks, http://www.ditechnetworks.com.
[16]
Divyasree, J., Rajashekar, H., and Varghese, K. 2008. Dynamically reconfigurable regular expression matching architecture. In Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors.
[17]
Fletcher, B. H. 2005. FPGA embedded processors. In Proceedings of the Embedded Training Program Embedded Systems Conference.
[18]
Foulk, P. 1993. Data-folding in SRAM configurable FPGAs. In Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines. 163--171.
[19]
Heath, L. S., Pemmaraju, S. V., and Trenk, A. N. 1999a. Stack and queue layouts of directed acyclic graphs: Part I. SIAM J. Comput. 28, 4, 1510--1539.
[20]
Heath, L. S. and Pemmaraju, S. V. 1999b. Stack and queue layouts of directed acyclic graphs: Part II. SIAM J. Comput. 28, 5, 1588--1626.
[21]
IBM. 2006. IBM PowerPC 405 embedded core. https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF778525699300651D97/file/PowerPC405_Nov2006.pdf.
[22]
Koopman, P. 1989. Stack Computers: The New Wave. E. Horwood, New York, NY.
[23]
Labovitz, C., Malan, G., and Jahanian, F. 1998. Internet routing instability. IEEE/ACM Trans. Netw. 6, 5, 515--528.
[24]
Merlier, M. 2011. Dynamisch herconfigureerbare partoonherkenning voor reguliere expressies op FPGA. M.S. thesis, Ghent University, Gent, Belgium.
[25]
Mishchenko, A. and Brayton, R. K. 2006. Scalable logic synthesis using a simple circuit structure. In Proceedings of the International Workshop on Logic & Synthesis (IWLS'06).
[26]
Pagiamtzis, K. and Sheikholeslami, A. 2006. Content-addressable memory (CAM) circuits and architectures: A tutorial and survey. IEEE J. Solid-State Circ. 41, 3, 712--727.
[27]
Prabhala, B. and Sethi, R. 1978. Efficient computation of expressions with common subexpressions. In Proceedings of the 5th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages.
[28]
Schuck, C., Haetzer, B., and Becker, J. 2009. An interface for a decentralized 2D reconfiguration on Xilinx Virtex-FPGAs for organic computing. Int. J. Reconfig. Comput. 11.
[29]
Sinha, V. and Kumar, K. V. 1978. Efficient evaluation of boolean expressions. SIGPLAN Not. 13, 12, 88--97.
[30]
Wirthlin, M. J. and Hutchings, B. L. 1997. Improving functional density through run-time constant propagation. In Proceedings of the ACM 5th International Symposium on Field-Programmable Gate Arrays (FPGA'97). ACM, New York, NY, 86--92.
[31]
Xilinx. 2010. Partial reconfiguration user guide. Xilinx, UG702: http://www.xilinx.com/tools/partial-reconfiguration.htm.
[32]
Xilinx. 2000. Benefits of using Xilinx FPGAs with MIPS microprocessors. Xilinx, http://www.xilinx.com/ipcenter/processor_central/wp121.pdf.
[33]
Xilinx. 2008. Virtex-II Pro libraries guide for HDL designs. Xilinx, http://www.xilinx.com/itp/xilinx 10/books/decs/vertex2p-hdl/virtex2p-hdl.pdf.
[34]
Yannakakis, M. 1986. Four pages are necessary and sufficient for planar graphs. In Proceedings of the 18th Annual ACM Symposium on Theory of Computing. ACM.

Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 18, Issue 3
July 2013
268 pages
ISSN:1084-4309
EISSN:1557-7309
DOI:10.1145/2491477
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 July 2013
Accepted: 01 November 2012
Revised: 01 June 2012
Received: 01 August 2011
Published in TODAES Volume 18, Issue 3

Author Tags

  1. Boolean Network evaluation
  2. FPGA
  3. dynamic circuit specialization
  4. runtime reconfiguration

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2020) An Integrated Approach and Tool Support for the Design of FPGA-Based Multi-Grain Reconfigurable Systems. IEEE Access 8 (202133-202152). DOI: 10.1109/ACCESS.2020.3036541. Online publication date: 2020
  • (2019) TPaR. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 33:3 (370-383). DOI: 10.1109/TCAD.2013.2291659. Online publication date: 3-Jan-2019
  • (2018) Reconfigurable FPGA Implementation of the AVC Quantiser and De-quantiser Blocks. Advanced Concepts for Intelligent Vision Systems (506-517). DOI: 10.1007/978-3-030-01449-0_43. Online publication date: 25-Sep-2018
  • (2017) Less is more: Increasing the scope of hardware debugging with compression. 2017 Panhellenic Conference on Electronics and Telecommunications (PACET) (1-4). DOI: 10.1109/PACET.2017.8259958. Online publication date: Nov-2017
  • (2015) TCONMAP. ACM Transactions on Design Automation of Electronic Systems 20:4 (1-27). DOI: 10.1145/2751558. Online publication date: 28-Sep-2015
  • (2015) Avoiding transitional effects in dynamic circuit specialisation on FPGAs. Proceedings of the 52nd Annual Design Automation Conference (1-6). DOI: 10.1145/2744769.2744802. Online publication date: 7-Jun-2015
  • (2015) Enabling FPGA routing configuration sharing in dynamic partial reconfiguration. Design Automation for Embedded Systems 19:1-2 (189-221). DOI: 10.1007/s10617-014-9143-8. Online publication date: 1-Mar-2015
  • (2014) Performance Evaluation of Dynamic Circuit Specialization on Xilinx FPGAs. Proceedings of the FPGA World Conference 2014 (1-6). DOI: 10.1145/2674095.2674096. Online publication date: 9-Sep-2014
