Compiling for reconfigurable computing: A survey

Authors: João M. P. Cardoso, Pedro C. Diniz, Markus Weinhardt

ACM Computing Surveys (CSUR), Volume 42, Issue 4

Article No.: 13, Pages 1 - 65

https://doi.org/10.1145/1749603.1749604

Published: 23 June 2010

Abstract

Reconfigurable computing platforms promise to substantially accelerate computations through the concurrent nature of hardware structures and the ability of these architectures to support hardware customization. Effectively programming such reconfigurable architectures, however, is an extremely cumbersome and error-prone process, as it requires programmers to assume the role of hardware designers while mastering hardware description languages, which limits the acceptance and dissemination of this promising technology. To address this problem, researchers have developed numerous approaches at both the programming-language and compilation levels to offer high-level programming abstractions that would allow programmers to easily map applications to reconfigurable architectures. This survey describes the major research efforts on compilation techniques for reconfigurable computing architectures. It focuses on efforts that map computations written in imperative programming languages to reconfigurable architectures and identifies the main compilation and synthesis techniques used in this mapping.

References

[1]

Aamodt, T. and Chow, P. 2000. Embedded ISA support for enhanced floating-point to fixed-point ANSI-C compilation. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES'00). ACM, New York, 128--137.

Digital Library

[2]

Accelchip. http://www.accelchip.com/.

[3]

Agarwal, L., Wazlowski, M., and Ghosh, S. 1994. An asynchronous approach to efficient execution of programs on adaptive architectures utilizing FPGAs. In Proceedings of the 2nd IEEE Workshop on FPGAs for Custom Computing Machines (FCCM'94). IEEE, Los Alamitos, CA, 101--110.

[4]

Allen, J. R., Kennedy, K., Porterfield, C., and Warren, J. 1983. Conversion of control dependence to data dependence. In Proceedings of the 10th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL'83). ACM, New York, 177--189.

Digital Library

[5]

Altera Inc. http://www.altera.com/.

[6]

Altera Inc. 2002. Stratix programmable logic device family data sheet 1.0, H.W.A.C. Altera Corp.

[7]

Amerson, R., Carter, R. J., Culbertson, W. B., Kuekes, P., and Snider, G. 1995. Teramac-configurable custom computing. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'95). IEEE, Los Alamitos, CA, 32--38.

Digital Library

[8]

Annapolis Microsystems Inc. 1999. WildStarTM reconfigurable computing engines, User's manual R3.3.

[9]

Athanas, P. 1992. An Adaptive Machine Architecture and Compiler for Dynamic Processor Reconfiguration. Brown University.

Digital Library

[10]

Athanas, P. and Silverman, H. 1993. Processor reconfiguration through instruction-set metamorphosis: Architecture and compiler. Computer 26, 3, 11--18.

Digital Library

[11]

August, D. I., Sias, J. W., Puiatti, J.-M., Mahlke, S. A., Connors, D. A., Crozier, K. M., and Hwu, W.-M. W. 1999. The program decision logic approach to predicated execution. In Proceedings of the 26th Annual International Symposium on Computer Architecture (ISCA'99). IEEE, Los Alamitos, CA, 208--219.

Digital Library

[12]

Babb, J. 2000. High-level compilation for reconfigurable architectures, Ph.D. dissertation, Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA.

Digital Library

[13]

Babb, J., Rinard, M., Moritz, C. A., Lee, W., Frank, M., Barua, R., and Amarasinghe, S. 1999. Parallelizing applications into silicon. In Proceedings of the 7th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'99). IEEE, Los Alamitos, CA, 70--81.

Digital Library

[14]

Banerjee, P., Shenoy, N., Choudhary, A., Hauck, S., Bachmann, C., Haldar, M., Joisha, P., Jones, A., Kanhare, A., Nayak, A., Periyacheri, S., Walkden, M., and Zaretsky, D. 2000. A MATLAB compiler for distributed, heterogeneous, reconfigurable computing systems. In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'00). IEEE, Los Alamitos, CA, 39--48.

Digital Library

[15]

Baradaran, N. and Diniz, P. 2006. Memory parallelism using custom array mapping to heterogeneous storage structures. In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL'06). IEEE, Los Alamitos, CA, 383--388.

[16]

Baradaran, N., Park, J., and Diniz, P. 2004. Compiler reuse analysis for the mapping of data in FPGAs with RAM blocks. In Proceedings of the IEEE International Conference on Field-Programmable Technology (FPT'04). IEEE, Los Alamitos, CA, 45--152.

[17]

Barua, R., Lee, W., Amarasinghe, S., and Agarwal, A. 2001. Compiler support for scalable and efficient memory systems. IEEE Trans. Computers 50, 11, 1234--1247.

Digital Library

[18]

Baumgarte, V., Ehlers, G., May, F., Ckel, A, N., Vorbach, M., and Weinhardt, M. 2003. PACT XPP: A self-reconfigurable data processing architecture. J. Supercomput. 26, 2, 167--184.

Digital Library

[19]

Beck, G., Yen, D., and Anderson, T. 1993. The Cydra 5 minisupercomputer: Architecture and implementation. J. Supercomput. 7, 1--2, 143--180.

Digital Library

[20]

Becker, J., Hartenstein, R., Herz, M., and Nageldinger, U. 1998. Parallelization in co-compilation for configurable accelerators. In Proceedings of the Asia South Pacific Design Automation Conference (ASP-DAC'98), 23--33.

[21]

Bellows, P. and Hutchings, B. 1998. JHDL-An HDL for reconfigurable systems. In Proceedings of the IEEE 6th Symposium on Field-Programmable Custom Computing Machines (FCCM'98). IEEE, Los Alamitos, CA, 175--184.

Digital Library

[22]

Bernstein, R. 1986. Multiplication by Integer constants. Softw.Pract. Exper. 16, 7, 641--652.

Digital Library

[23]

Bjesse, P., Claessen, K., Sheeran, M., and Singh, S. 1998. Lava: Hardware design in Haskell. In Proceedings of the 3rd ACM SIGPLAN International Conference on Functional Programming (ICFP'98). ACM, New York, 174--184.

Digital Library

[24]

Böhm, W., Hammes, J., Draper, B., Chawathe, M., Ross, C., Rinker, R., and Najjar, W. 2002. Mapping a single assignment programming language to reconfigurable systems. J. Supercomput. 21, 2, 117--130.

Digital Library

[25]

Böhm, A. P. W., Draper, B., Najjar, W., Hammes, J., Rinker, R., Chawathe, M., and Ross, C. 2001. One-step compilation of image processing algorithms to FPGAs. In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'01). IEEE, Los Alamitos, CA, 209--218.

Digital Library

[26]

Bondalapati, K. 2001. Parallelizing of DSP nested loops on reconfigurable architectures using data context switching. In Proceedings of the IEEE/ACM 38th Design Automation Conference (DAC'01). ACM, New York, 273--276.

Digital Library

[27]

Bondalapati, K., Diniz, P., Duncan, P., Granacki, J., Hall, M., Jain, R., and Ziegler, H. 1999. DEFACTO: A design environment for adaptive computing technology. In Proceedings of the 6th Reconfigurable Architectures Workshop (RAW'99). Lecture Notes in Computer Science, vol. 1586, Springer, Berlin, 570--578.

Digital Library

[28]

Bondalapati, K. and Prasanna, V. K. 1999. Dynamic precision management for loop computations on reconfigurable architectures. In Proceedings of the 7th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'99). IEEE, Los Alamitos, CA, 249--258.

Digital Library

[29]

Brasen, D. R. and Saucier, G. 1998. Using cone structures for circuit partitioning into FPGA packages. IEEE Trans. Comput.-Aid. Des. Integrt. Circuits Syst. 17, 7, 592--600.

Digital Library

[30]

Brooks, D. and Martonosi, M. 1999. Dynamically exploiting narrow width operands to improve processor power and performance. In Proceedings of the 5th International Symposium on High Performance Computer Architecture (HPCA'99). IEEE, Los Alamitos, CA, 13--22.

Digital Library

[31]

Budiu, M., Goldstein, S., Sakr, M., and Walker, K. 2000. BitValue inference: Detecting and exploiting narrow bit-width computations. In Proceedings of the 6th International European Conference on Parallel Computing (EuroPar'00). Lecture Notes in Computer Science, vol. 1900, Springer, Berlin, 969--979.

Digital Library

[32]

Budiu, M. and Goldstein, S. C. 1999. Fast compilation for pipelined reconfigurable fabrics. In Proceedings of the ACM/SIGDA 7th International Symposium on Field Programmable Gate Arrays (FPGA'99). ACM, New York, 195--205.

Digital Library

[33]

Cadambi, S. and Goldstein, S. 2000. Efficient place and route for pipeline reconfigurable architectures. In Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers & Processors (ICCD'00). IEEE, Los Alamitos, CA, 423--429.

Digital Library

[34]

Callahan, T. J. 2002. Automatic compilation of C for hybrid reconfigurable architectures. Ph.D. thesis, University of California, Berkeley.

Digital Library

[35]

Callahan, T. J., Hauser, J. R., and Wawrzynek, J. 2000. The Garp architecture and C compiler. Computer 33, 4, 62--69.

Digital Library

[36]

Callahan, T. J. and Wawrzynek, J. 2000. Adapting software pipelining for reconfigurable computing. In Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES'00). ACM, New York, 57--64.

Digital Library

[37]

Callahan, T. J. and Wawrzynek, J. 1998. Instruction-level parallelism for reconfigurable computing. In Proceedings of the 8th International Workshop on Field-Programmable Logic and Applications (FPL'98). Lecture Notes in Computer Science, vol. 1482, Springer, Berlin, 248--257.

Digital Library

[38]

Callahan, T. J., Chong, P., Dehon, A., and Wawrzynek, J. 1998. Fast module mapping and placement for data-paths in FPGAs. In Proceedings of the ACM 6th International Symposium on Field Programmable Gate Arrays (FPGA'98). ACM, New York, 123--132.

Digital Library

[39]

Cardoso, J. M. P. 2003. On combining temporal partitioning and sharing of functional units in compilation for reconfigurable architectures. IEEE Trans. Comput. 52, 10, 1362--1375.

Digital Library

[40]

Cardoso, J. M. P. and Neto, H. C. 2003. Compilation for FPGA-based reconfigurable hardware. IEEE Des. Test Comput. Mag. 20, 2, 65--75.

Digital Library

[41]

Cardoso, J. M. P. and Weinhardt, M. 2002. XPP-VC: A C compiler with temporal partitioning for the PACT-XPP architecture. In Proceedings of the 12th International Conference on Field-Programmable Logic and Applications (FPL'02). Lecture Notes in Computer Science, Springer, Berlin, 864--874.

Digital Library

[42]

Cardoso, J. M. P. and Neto, H. C. 2001. Compilation increasing the scheduling scope for multi-memory-FPGA-based custom computing machines. In Proceedings of the 11th International Conference on Field Programmable Logic and Applications (FPL'01). Lecture Notes in Computer Science, vol. 2147, Springer, Berlin, 523--533.

Digital Library

[43]

Cardoso, J. M. P. and Neto, H. C. 2000. An enhanced static-list scheduling algorithm for temporal partitioning onto RPUs. In Proceedings of the IFIP TC10/WG10.5 10th International Conference on Very Large Scale Integration (VLSI'99).

Digital Library

[44]

Cardoso, J. M. P. and Neto, H. C. 1999. Macro-based hardware compilation of Java bytecodes into a dynamic reconfigurable computing system. In Proceedings of the IEEE 7th Symposium on Field-Programmable Custom Computing Machines (FCCM'99). IEEE, Los Alamitos, CA, 2--11.

Digital Library

[45]

Caspi, E. 2000. Empirical study of opportunities for bit-level specialization in word-based programs, Tech. rep., University of California Berkeley.

Digital Library

[46]

Caspi, E., Chu, M., Randy, H., Yeh, J., Wawrzynek, J., and Dehon, A. 2000. Stream computations organized for reconfigurable execution (SCORE). In Proceedings of the 10th International Workshop on Field-Programmable Logic and Applications (FPL'00). Lecture Notes in Computer Science, vol. 1896, Springer, Berlin, 605--614.

Digital Library

[47]

Celoxica Ltd. http://www.celoxica.com/.

[48]

Compton, K. and Hauck, S. 2002. Reconfigurable computing: a survey of systems and software. ACM Comput. Surv. 34, 2, 171--210.

Digital Library

[49]

Cronquist, D. C., Franklin, P., Berg, S. G., and Ebeling, C. 1998. Specifying and compiling applications for RaPiD. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'98). IEEE, Los Alamitos, CA, 116--125.

Digital Library

[50]

Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., and Zadeck, F. K. 1991. Efficiently computing static single assignment form and the control dependence graph. ACM Trans. Program. Lang. Syst. 13, 4, 451--490.

Digital Library

[51]

Dehon, A., Markovskya, Y., Caspia, E., Chua, M., Huanga, R., Perissakisa, S., Pozzi, L., Yeha, J., and Wawrzyneka, J. 2006. Stream computations organized for reconfigurable execution. Microprocess. Microsyst. 30, 6, 334--354.

[52]

Dehon, A. 2000. The density advantage of configurable computing. Computer 33, 4, 41--49.

Digital Library

[53]

Dehon, A. 1996. Reconfigurable architectures for general-purpose computing. Tech. rep. MIT, Cambridge, MA.

Digital Library

[54]

Diniz, P. C. 2005. Evaluation of code generation strategies for scalar replaced codes in fine-grain configurable architectures. In Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'05). IEEE, Los Alamitos, CA, 73--82.

Digital Library

[55]

Diniz, P. C., Hall, M. W., Park, J., So, B., and Ziegler, H. E. 2001. Bridging the gap between compilation and synthesis in the DEFACTO system. In Proceedings of the 14th Workshop on Languages and Compilers for Parallel Computing (LCPC'01). Lecture Notes in Computer Science, vol. 2624, Springer, Berlin, 2003, 52--70.

Digital Library

[56]

Doncev, G., Leeser, M., and Tarafdar, S. 1998. High level synthesis for designing custom computing hardware. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'98). IEEE, Los Alamitos, CA, 326--327.

Digital Library

[57]

Duncan, A. A., Hendry, D. C., and Cray, P. 2001. The COBRA-ABS high level synthesis system for multi-FPGA custom computing machines. IEEE Trans. VLSI Syst. 9, 1, 218--223.

Digital Library

[58]

Duncan, A. A., Hendry, D.C., and Cray, P. 1998. An overview of the COBRA-ABS high level synthesis system for multi-FPGA systems. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'98). IEEE, Los Alamitos, CA, 106--115.

Digital Library

[59]

Ebeling, C., Cronquist, D. C., and Franklin, P. 1995. RaPiD—Reconfigurable pipelined datapath. In Proceedings of the 6th International Workshop on Field-Programmable Logic and Applications (FPL'95). Lecture Notes in Computer Science, vol. 975, Springer, Berlin, 126--135.

Digital Library

[60]

Edwards, S. 2002. High-level synthesis from the synchronous language Esterel. In Proceedings of the 11th IEEE/ACM International Workshop on Logic and Synthesis (IWLS'02). 401--406.

[61]

Fekete, S., Köhler, E., and Teich, J. 2001. Optimal FPGA module placement with temporal precedence constraints. In Proceedings of the IEEE/ACM Design Automation and Test in Europe Conference and Exhibition (DATE'01). 658--665.

Digital Library

[62]

Xilinx, Inc. Forge. Forge compiler. http://www.lavalogic.com/.

[63]

Frigo, J., Gokhale, M., and Lavenier, D. 2001. Evaluation of the Streams-C C-to-FPGA compiler: An applications perspective. In Proceedings of the ACM 9th International Symposium on Field-Programmable Gate Arrays (FPGA'01). ACM, New York, 134--140.

Digital Library

[64]

Fujii, T., Furuta, K., Motomura, M., Nomura, M., Mizuno, M., Anjo, K., Wakabayashi, K., Hirota, Y., Nakazawa, Y., Ito, H., and Yamashina, M. 1999. A dynamically reconfigurable logic engine with a multi-context/multi-mode unified-cell architecture. In Proceedings of the IEEE International Solid State Circuits Conference (ISSCC'99). IEEE, Los Alamitos, CA, 364--365.

[65]

Gajski, D. D., Dutt, N. D., Wu, A. C. H., and Lin, S. Y. L. 1992. High-level Synthesis: Introduction to Chip and System Design. Kluwer, Amsterdam.

Digital Library

[66]

Galloway, D. 1995. The transmogrifier C hardware description language and compiler for FPGAs. In Proceedings of the 3rd IEEE Workshop on FPGAs for Custom Computing Machines (FCCM'95). IEEE, Los Alamitos, CA, 136--144.

Digital Library

[67]

Ganesan, S. and Vemuri, R. 2000. An integrated temporal partitioning and partial reconfiguration technique for design latency improvement. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE'00). IEEE, Los Alamitos, CA, 320--325.

Digital Library

[68]

Girkar, M. and Polychronopoulos, C. D. 1992. Automatic extraction of functional parallelism from ordinary programs. IEEE Trans. Parall. Distrib. Syst. 3, 2, 166--178.

Digital Library

[69]

Gokhale, M. and Graham, P. S. 2005. Reconfigurable Computing: Accelerating Computation with Field-Programmable Gate Arrays. Springer, Berlin.

Digital Library

[70]

Gokhale, M., Stone, J. M., and Gomersall, E. 2000a. Co-synthesis to a hybrid RISC/FPGA architecture. J. VLSI Signal Process. Syst. Signal, Image Video Technol. 24, 2, 165--180.

Digital Library

[71]

Gokhale, M., Stone, J. M., Arnold, J., and Kalinowski, M. 2000b. Stream-oriented FPGA computing in the Streams-C high level language. In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'00). IEEE, Los Alamitos, CA, 49--56.

Digital Library

[72]

Gokhale, M. and Stone, J. 1999. Automatic allocation of arrays to memories in FPGA processors with multiple memory banks. In Proceedings of the 7th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'99). IEEE, Los Alamitos, CA, 63--69.

Digital Library

[73]

Gokhale, M. and Stone, J. M. 1998. NAPA C: Compiling for a hybrid RISC/FPGA architecture. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM). IEEE, Los Alamitos, CA, 126--135.

Digital Library

[74]

Gokhale, M. and Gomersall, E. 1997. High-level compilation for fine grain FPGAs. In Proceedings of the IEEE 5th Symposium on Field-Programmable Custom Computing Machines (FCCM'97). IEEE, Los Alamitos, CA, 165--173.

Digital Library

[75]

Gokhale, M. and Marks, A. 1995. Automatic synthesis of parallel programs targeted to dynamically reconfigurable logic. In Proceedings of the 5th International Workshop on Field Programmable Logic and Applications (FPL'95). Lecture Notes in Computer Science, vol. 975, Springer, Berlin, 399--408.

Digital Library

[76]

Gokhale, M. and Carlson, W. 1992. An introduction to compilation issues for parallel machines. J. Supercomput. 283--314.

[77]

Gokhale, M., Holmes, W., Kopser, A., Kunze, D., Lopresti, D. P., Lucas, S., Minnich, R., and Olsen, P. 1990. SPLASH: A reconfigurable linear logic array. In Proceedings of the International Conference on Parallel Processing (ICPP'90). 526--532.

[78]

Goldstein, S. C., Schmit, H., Budiu, M., Cadambi, S., Moe, M., and Taylor, R. R. 2000. PipeRench: A reconfigurable architecture and compiler. Computer 33, 4, 70--77.

Digital Library

[79]

Goldstein, S. C., Schmit, H., Moe, M., Budiu, M., Cadambi, S., Taylor, R. R., and Laufer, R. 1999. PipeRench: A coprocessor for streaming multimedia acceleration. In Proceedings of the 26th Annual International Symposium on Computer Architecture (ISCA'99). IEEE, Los Alamitos, CA, 28--39.

Digital Library

[80]

Gonzalez, R. E. 2000. Xtensa: A configurable and extensible processor. IEEE Micro 20, 2, 60--70.

Digital Library

[81]

Guccione, S., Levi, D., and Sundararajan, P. 2000. Jbits: Java based interface for reconfigurable computing. In Proceedings of the Military and Aerospace Applications of Programmable Devices and Technologies Conference (MAPLD'00). 1--9.

[82]

Guo, Z. and Najjar, W. 2006. A compiler intermediate representation for reconfigurable fabrics. In Proceedings of the 16th International Conference on Field Programmable Logic and Applications (FPL'2006). IEEE, Los Alamitos, CA, 741--744.

[83]

Guo, Z., Buyukkurt, A. B., and Najjar, W. 2004. Input data reuse in compiling Window operations onto reconfigurable hardware. In Proceedings of the ACM Symposium on Languages, Compilers and Tools for Embedded Systems (LCTES'04). ACM SIGPLAN Not. 39, 7, 249--256.

Digital Library

[84]

Gupta, S., Savoiu, N., Kim, S., Dutt, N., Gupta, R., and Nicolau, A. 2001. Speculation techniques for high-level synthesis of control intensive designs. In Proceedings of the 38th IEEE/ACM Design Automation Conference (DAC'01). ACM, New York, 269--272.

Digital Library

[85]

Haldar, M., Nayak, A., Choudhary, A., and Banerjee, P. 2001a. A system for synthesizing optimized FPGA hardware from MATLAB. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD'01). IEEE, Los Alamitos, CA, 314--319.

Digital Library

[86]

Haldar, M., Nayak, A., Shenoy, N., Choudhary, A., and Banerjee, P. 2001b. FPGA hardware synthesis from MATLAB. In Proceedings of the 14th International Conference on VLSI Design (VLSID'01). 299--304.

Digital Library

[87]

Hartenstein, R. W. 2001. A decade of reconfigurable computing: A visionary retrospective. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE). IEEE, Los Alamitos, CA, 642--649.

Digital Library

[88]

Hartenstein, R. W. 1997. The microprocessor is no more general purpose: Why future reconfigurable platforms will win. In Proceedings of the International Conference on Innovative Systems in Silicon (ISIS'97).

[89]

Hartenstein, R. W., Becker, J., Kress, R., and Reinig, H. 1996. High-performance computing using a reconfigurable accelerator. Concurrency—Pract. Exper. 8, 6, 429--443.

[90]

Hartenstein, R. W. and Kress, R. 1995. A datapath synthesis system for the reconfigurable datapath architecture. In Procedings of the Asia and South Pacific Design Automation Conference (ASP-DAC'95). ACM, New York, 479--484.

Digital Library

[91]

Hartley, R. 1991. Optimization of canonic signed digit multipliers for filter design. In Proceedings of the IEEE International Sympoisum on Circuits and Systems (ISCA'91). IEEE, Los Alamitos, CA, 343--348.

[92]

Hauck, S., Fry, T. W., Hosler, M. M., and Kao, J. P. 2004. The Chimaera reconfigurable functional unit. IEEE Trans. VLSI Syst. 12, 2, 206--217.

Digital Library

[93]

Hauck, S. 1998. The roles of FPGAs in reprogrammable systems. Proc. IEEE 86, 4, 615--638.

[94]

Hauser, J. R. and Wawrzynek, J. 1997. Garp: A MIPS processor with a reconfigurable coprocessor. In Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines (FCCM'97). IEEE, Los Alamitos, CA, 12--21.

Digital Library

[95]

Hoare, C. A. R. 1978. Communicating sequential processes. Comm. ACM 21, 8, 666--677.

Digital Library

[96]

Impact. The Impact Research Group. http://www.crhc.uiuc.edu/.

[97]

Impulse-Accelerated-Technologies Inc. http://www.impulsec.com/.

[98]

Inoue, A., Tomiyama, H., Okuma, H., Kanbara, H., and Yasuura, H. 1998. Language and compiler for optimizing datapath widths of embedded systems. IEICE Trans. Fundamentals E81-A, 12, 2595--2604.

[99]

Iseli, C. and Sanchez, E. 1993. Spyder: A reconfigurable VLIW processor using FPGAs. In Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines (FCCM'93). IEEE, Los Alamitos, CA, 17--24.

[100]

Jones, M., Scharf, L., Scott, J., Twaddle, C., Yaconis, M., Yao, K., Athanas, P., and Schott, B. 1999. Implementing an API for distributed adaptive computing systems. In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'99). IEEE, Los Alamitos, CA, 222--230.

Digital Library

[101]

Jong, G. D., Verdonck, B. L. C., Wuytack, S., and Catthoor, F. 1995. Background memory management for dynamic data structure intensive processing systems. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD'95). IEEE, Los Alamitos, CA, 515--520.

Digital Library

[102]

Kastrup, B., Bink, A., and Hoogerbrugge, J. 1999. ConCISe: A compiler-driven CPLD-based instruction set accelerator. In Proceedings of the Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'99). IEEE, Los Alamitos, CA, 92--101.

Digital Library

[103]

Kaul, M. and Vemuri, R. 1999. Temporal partitioning combined with design space exploration for latency minimization of run-time reconfigured designs. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE'99). 202--209.

Digital Library

[104]

Kaul, M., Vemuri, R., Govindarajan, S., and Ouaiss, I. 1999. An automated temporal partitioning and loop fission approach for FPGA based reconfigurable synthesis of DSP applications. In Proceedings of the IEEE/ACM Design Automation Conference (DAC'99). ACM, New York, 616--622.

Digital Library

[105]

Kaul, M. and Vemuri, R. 1998. Optimal temporal partitioning and synthesis for reconfigurable architectures. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE'98). IEEE, Los Alamitos, CA, 389--396.

Digital Library

[106]

Khouri, K. S., Lakshminarayana, G., and Jha, N. K. 1999. Memory binding for performance optimization of control-flow intensive behaviors. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD'99). IEEE, Los Alamitos, CA, 482--488.

Digital Library

[107]

Kobayashi, S., Kozuka, I., Tang, W. H., and Landmann, D. 2004. A software/hardware codesigned hands-free system on a “resizable” block-floating-point DSP. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04). IEEE, Los Alamitos, CA, 149--152.

[108]

Kress, R. 1996. A fast reconfigurable ALU for Xputers. Tech. rep. Kaiserlautern University, Kaiserlautern.

[109]

Krupnova, H. and Saucier, G. 1999. Hierarchical interactive approach to partition large designs into FPGAs. In Proceedings of the 9th International Workshop on Field-Programmable Logic and Applications (FPL'99). Lecture Notes in Computer Science, vol. 1673, Springer, Berlin, 101--110.

Digital Library

[110]

Kum, K.-I., Kang, J., and Sung, W. 2000. AUTOSCALER for C: An optimizing floating-point to integer C program converter for fixed-point digital signal processors. IEEE Trans. Circuits Syst. II, 47, 9, 840--848.

[111]

Lakshmikanthan, P., Govindarajan, S., Srinivasan, V., and Vemuri, R. 2000. Behavioral partitioning with synthesis for multi-FPGA architectures under interconnect, area, and latency constraints. In Proceedings of the 7th Reconfigurable Architectures Workshop (RAW'00). Lecture Notes in Computer Science, vol. 1800, Springer, Berlin, 924--931.

Digital Library

[112]

Lakshminarayana, G., Khouri, K. S., and Jha, N. K. 1997. Wavesched: A novel scheduling technique for control-flow intensive designs. In Proceedings of the 1997 IEEE/ACM International Conference on Computer-Aided Design. IEEE, Los Alamitos, CA, 244--250.

Digital Library

[113]

Lee, W., Barua, R., Frank, M., Srikrishna, D., Babb, J., Sarkar, V., and Amarasinghe, S. 1998. Space-time scheduling of instruction-level parallelism on a raw machine. In Proceedings of the ACM 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'98). ACM, New York, 46--57.

Digital Library

[114]

Leiserson, C. E. and Saxe, J. B. 1991. Retiming synchronous circuitry. Algorithmica 6, 1, 5--35.

Digital Library

[115]

Leong, M. P., Yeung, M. Y., Yeung, C. K., Fu, C. W., Heng, P. A., and Leong, P. H. W. 1999. Automatic floating to fixed point translation and its application to post-rendering 3D warping. In Proceedings of the Seventh Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'99). IEEE, Los Alamitos, CA, 240--248.

Digital Library

[116]

Lewis, D. M., Ierssel, M. V., Rose, J., and Chow, P. 1998. The Transmogrifier-2: A 1 million gate rapid-prototyping system. IEEE Trans. VLSI Syst. 6, 2, 188--198.

Digital Library

[117]

Li, Y., Callahan, T., Darnell, E., Harr, R., Kurkure, U., and Stockwood, J. 2000. Hardware-software co-design of embedded reconfigurable architectures. In Proceedings of the IEEE/ACM Design Automation Conference (DAC'00). 507--512.

Digital Library

[118]

Liu, H. and Wong, D. F. 1999. Circuit partitioning for dynamically reconfigurable FPGAs. In Proceedings of the ACM 7th International Symposium on Field-Programmable Gate Arrays (FPGA'99). ACM, New York, 187--194.

Digital Library

[119]

Luk, W. and Wu, T. 1994. Towards a declarative framework for hardware-software codesign. In Proceedings of the 3rd International Workshop on Hardware/Software Codesign (CODES'94). IEEE, Los Alamitos, CA, 181--188.

Digital Library

[120]

Lysaght, P. and Rosenstiel, W. 2005. New Algorithms, Architectures and Applications for Reconfigurable Computing. Springer, Berlin.

Digital Library

[121]

Magenheimer, D. J., Peters, L., Pettis, K. W., and Zuras, D. 1988. Integer multiplication and division on the HP precision architecture. IEEE Trans. Computers 37, 8, 980--990.

Digital Library

[122]

Mahlke, S. A., Lin, D. C., Chen, W. Y., Hank, R. E., and Bringmann, R. A. 1992. Effective compiler support for predicated execution using the hyperblock. ACM SIGMICRO Newsl. 23, 1--2, 45--54.

Digital Library

[123]

Markovskiy, Y., Caspi, E., Huang, R., Yeh, J., Chu, M., Wawrzynek, J., and Dehon, A. 2002. Analysis of quasistatic scheduling techniques in a virtualized reconfigurable machine. In Proceedings of the ACM International Symposium on Field Programmable Gate Arrays (FPGA'02). ACM, New York, 196--205.

Digital Library

[124]

Maruyama, T. and Hoshino, T. 2000. A C to HDL compiler for pipeline processing on FPGAs. In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'00). IEEE, Los Alamitos, CA, 101--110.

Digital Library

[125]

MathStar Inc. http://www.mathstar.com.

[126]

Mathworks. Home page: http://www.mathworks.com/.

[127]

Mei, B., Lambrechts, A., Verkest, D., Mignolet, J.-Y., and Lauwereins, R. 2005. Architecture exploration for a reconfigurable architecture template. IEEE Des. Test Comput. Mag. 22, 2, 90--101.

Digital Library

[128]

Mei, B., Vernalde, S., Verkest, D., Man, H. D., and Lauwereins, R. 2002. Dresc: A retargetable compiler for coarse-grained reconfigurable architectures. In Proceedings of the IEEE International Conference on Field-Programmable Technology (FPT'02). IEEE, Los Alamitos, CA, 166--173.

[129]

Mei, B., Vernalde, S., Verkest, D., Man, H. D., and Lauwereins, R. 2003. ADRES: An architecture with tightly coupled VLIW processor and coarse-grained reconfigurable matrix. In Proceedings of the International Conference on Field Programmable Logic and Application (FPL'03). Lecture Notes in Computer Science, vol. 2778, Springer, Berlin, 61--70.

[130]

Mencer, O., Platzner, M., Morf, M., and Flynn, M. J. 2001. Object-oriented domain-specific compilers for programming FPGAs. IEEE Trans. VLSI Syst. 9, 1, 205--210.

Digital Library

[131]

Micheli, G. D. 1994. Synthesis and Optimization of Digital Circuits. McGraw Hill, New York.

Digital Library

[132]

Micheli, G. D. and Gupta, R. 1997. Hardware/software co-design. Proc. IEEE 85, 3, 349--365.

[133]

Mirsky, E. and Dehon, A. 1996. MATRIX: A reconfigurable computing device with reconfigurable instruction deployable resources. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'96). IEEE, Los Alamitos, CA, 157--166.

[134]

Mitrionics A. B. http://www.mitrionics.com/.

[135]

Miyamori, T. and Olukotun, K. 1998. A quantitative analysis of reconfigurable coprocessors for multimedia applications. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'98). IEEE, Los Alamitos, CA, 2--11.

[136]

Moll, L., Vuillemin, J., and Boucard, P. 1995. High-energy physics on DECPeRLe-1 programmable active memory. In Proceedings of the ACM 3rd International Symposium on Field-Programmable Gate Arrays (FPGA'95). ACM, New York, 47--52.

[137]

Muchnick, S. S. 1997. Advanced Compiler Design and Implementation. Morgan Kaufmann, San Francisco, CA.

[138]

Nallatech Inc. http://www.nallatech.com.

[139]

Nayak, A., Haldar, M., Choudhary, A., and Banerjee, P. 2001a. Parallelization of Matlab applications for a multi-FPGA system. In Proceedings of the IEEE 9th Symposium on Field-Programmable Custom Computing Machines (FCCM'01). IEEE, Los Alamitos, CA, 1--9.

[140]

Nayak, A., Haldar, M., Choudhary, A., and Banerjee, P. 2001b. Precision and error analysis of MATLAB applications during automated hardware synthesis for FPGAs. In Proceedings of the Design, Automation and Test Conference in Europe (DATE'01). IEEE, Los Alamitos, CA, 722--728.

[141]

Nisbet, S. and Guccione, S. 1997. The XC6200DS development system. In Proceedings of the 7th International Workshop on Field-Programmable Logic and Applications (FPL'97). Lecture Notes in Computer Science, vol. 1304, Springer, Berlin, 61--68.

[142]

Ogawa, O., Takagi, K., Itoh, Y., Kimura, S., and Watanabe, K. 1999. Hardware synthesis from C programs with estimation of bit-length of variables. IEICE Trans. Fundamentals E82-A, 11, 2338--2346.

[143]

Ong, S.-W., Kerkiz, N., Srijanto, B., Tan, C., Langston, M., Newport, D., and Bouldin, D. 2001. Automatic mapping of multiple applications to multiple adaptive computing systems. In Proceedings of the 9th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'01). IEEE, Los Alamitos, CA, 10--20.

[144]

Ouaiss, I. and Vemuri, R. 2001. Hierarchical memory mapping during synthesis in FPGA-based reconfigurable computers. In Proceedings of the Design, Automation and Test in Europe (DATE'01). IEEE, Los Alamitos, CA, 650--657.

[145]

Ouaiss, I. and Vemuri, R. 2000. Efficient resource arbitration in reconfigurable computing environments. In Proceedings of the Design, Automation and Test in Europe (DATE'00). 560--566.

[146]

Ouaiss, I., Govindarajan, S., Srinivasan, V., Kaul, M., and Vemuri, R. 1998a. An integrated partitioning and synthesis system for dynamically reconfigurable multi-FPGA architectures. In Proceedings of the 5th Reconfigurable Architectures Workshop (RAW'98). Lecture Notes in Computer Science, vol. 1388, Springer, Berlin, 31--36.

[147]

Ouaiss, I., Govindarajan, S., Srinivasan, V., Kaul, M., and Vemuri, R. 1998b. A unified specification model of concurrency and coordination for synthesis from VHDL. In Proceedings of the International Conference on Information Systems Analysis and Synthesis (ISAS'98). 771--778.

[148]

Page, I. 1996. Constructing hardware-software systems from a single description. J. VLSI Signal Process. 87--107.

[149]

Page, I. and Luk, W. 1991. Compiling Occam into FPGAs. In FPGAs, Abingdon EE&CS Books, Abingdon, UK, 271--283.

[150]

Pandey, A. and Vemuri, R. 1999. Combined temporal partitioning and scheduling for reconfigurable architectures. In Proceedings of the SPIE Photonics East Conference. 93--103.

[151]

Park, J. and Diniz, P. 2001. Synthesis of memory access controller for streamed data applications for FPGA-based computing engines. In Proceedings of the 14th International Symposium on System Synthesis (ISSS'01). 221--226.

[152]

Pellerin, D. and Thibault, S. 2005. Practical FPGA Programming in C. Prentice Hall, Englewood Cliffs, NJ.

[153]

Peterson, J. B., O'Connor, R. B., and Athanas, P. 1996. Scheduling and partitioning ANSI-C programs onto multi-FPGA CCM architectures. In Proceedings of the 4th IEEE Symposium on Field Programmable Custom Computing Machines (FCCM'96). IEEE, Los Alamitos, CA, 178--179.

[154]

Purna, K. M. G. and Bhatia, D. 1999. Temporal partitioning and scheduling data flow graphs for reconfigurable computers. IEEE Trans. Computers 48, 6, 579--590.

[155]

Radetzki, M. 2000. Synthesis of digital circuits from object-oriented specifications. Tech. rep. Oldenburg University, Oldenburg, Germany.

[156]

Raimbault, F., Lavenier, D., Rubini, S., and Pottier, B. 1993. Fine grain parallelism on an MIMD machine using FPGAs. In Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines (FCCM'93). IEEE, Los Alamitos, CA, 2--8.

[157]

Rajan, J. V. and Thomas, D. E. 1985. Synthesis by delayed binding of decisions. In Proceedings of the 22nd IEEE Design Automation Conference (DAC'85). IEEE, Los Alamitos, CA, 367--373.

[158]

Ralev, K. R. and Bauer, P. H. 1999. Realization of block floating point digital filters and application to block implementations. IEEE Trans. Signal Process. 47, 4, 1076--1086.

[159]

Ramachandran, L., Gajski, D., and Chaiyakul, V. 1994. An algorithm for array variable clustering. In Proceedings of the European Design Test Conference (EDAC'94). IEEE, Los Alamitos, CA, 262--266.

[160]

Ramanujam, J. and Sadayappan, P. 1991. Compile-time techniques for data distribution in distributed memory machines. IEEE Trans. Parall. Distrib. Syst. 2, 4, 472--482.

[161]

Rau, B. R. 1994. Iterative modulo scheduling: An algorithm for software pipelining loops. In Proceedings of the ACM 27th Annual International Symposium on Microarchitecture (MICRO-27). ACM, New York, 63--74.

[162]

Razdan, R. 1994. PRISC: Programmable reduced instruction set computers. Tech. rep. Division of Applied Sciences, Harvard University, Cambridge, MA.

[163]

Razdan, R. and Smith, M. D. 1994. A high-performance microarchitecture with hardware-programmable functional units. In Proceedings of the 27th IEEE/ACM Annual International Symposium on Microarchitecture (MICRO-27). 172--180.

[164]

Rinker, R., Carter, M., Patel, A., Chawathe, M., Ross, C., Hammes, J., Najjar, W., and Böhm, A. P. W. 2001. An automated process for compiling dataflow graphs into hardware. IEEE Trans. VLSI Syst. 9, 1, 130--139.

[165]

Rivera, G. and Tseng, C.-W. 1998. Data transformations for eliminating cache misses. In Proceedings of the ACM Conference on Programming Language Design and Implementation (PLDI'98). ACM, New York, 38--49.

[166]

Rupp, C. R., Landguth, M., Garverick, T., Gomersall, E., Holt, H., Arnold, J. M., and Gokhale, M. 1998. The NAPA adaptive processing architecture. In Proceedings of the IEEE 6th Symposium on Field-Programmable Custom Computing Machines (FCCM'98). IEEE, Los Alamitos, CA, 28--37.

[167]

Salefski, B. and Caglar, L. 2001. Re-configurable computing in wireless. In Proceedings of the 38th Annual ACM IEEE Design Automation Conference (DAC'01). 178--183.

[168]

Santos, L. C. V. D., Heijligers, M. J. M., Eijk, C. A. J. V., Eijnhoven, J. V., and Jess, J. A. G. 2000. A code-motion pruning technique for global scheduling. ACM Trans. Des. Autom. Electron. Syst. 5, 1, 1--38.

[169]

Schmit, H. and Thomas, D. 1998. Address generation for memories containing multiple arrays. IEEE Trans. Comput.-Aid. Des. Integrat. Circuits Syst. 17, 5, 377--385.

[170]

Schmit, H. and Thomas, D. 1997. Synthesis of applications-specific memory designs. IEEE Trans. VLSI Syst. 5, 1, 101--111.

[171]

Schmit, H., Arnstein, L., Thomas, D., and Lagnese, E. 1994. Behavioral synthesis for FPGA-based computing. In Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines (FCCM'94). IEEE, Los Alamitos, CA, 125--132.

[172]

Séméria, L., Sato, K., and Micheli, G. D. 2001. Synthesis of hardware models in C with pointers and complex data structures. IEEE Trans. VLSI Syst. 9, 6, 743--756.

[173]

Sharp, R. and Mycroft, A. 2001. A higher level language for hardware synthesis. In Proceedings of the 11th IFIP WG 10.5 Advanced Research Working Conference on Correct Hardware Design and Verification Methods (CHARME'01). Lecture Notes in Computer Science, vol. 2144, Springer, Berlin, 228--243.

[174]

Shirazi, N., Walters, A., and Athanas, P. 1995. Quantitative analysis of floating point arithmetic on FPGA-based custom computing machines. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'95). IEEE, Los Alamitos, CA, 155--162.

[175]

Singh, H., Lee, M.-H., Lu, G., Bagherzadeh, N., Kurdahi, F. J., and Filho, E. M. C. 2000. MorphoSys: An integrated reconfigurable system for data-parallel and computation-intensive applications. IEEE Trans. Computers 49, 5, 465--481.

[176]

Snider, G. 2002. Performance-constrained pipelining of software loops onto reconfigurable hardware. In Proceedings of the ACM 10th International Symposium on Field-Programmable Gate Arrays (FPGA'02). ACM, New York, 177--186.

[177]

Snider, G., Shackleford, B., and Carter, R. J. 2001. Attacking the semantic gap between application programming languages and configurable hardware. In Proceedings of the ACM 9th International Symposium on Field-Programmable Gate Arrays (FPGA'01). ACM, New York, 115--124.

[178]

So, B., Hall, M., and Ziegler, H. 2004. Custom data layout for memory parallelism. In Proceedings of the International Symposium on Code Generation and Optimization (CGO'04). IEEE, Los Alamitos, CA, 291--302.

[179]

So, B. and Hall, M. W. 2004. Increasing the applicability of scalar replacement. In Proceedings of the ACM Symposium on Compiler Construction (CC'04). Lecture Notes in Computer Science, vol. 2985, Springer, Berlin, 185--201.

[180]

SRC Computers Inc. http://www.srccomp.com/.

[181]

Starbridge-Systems Inc. http://www.starbridgesystems.com.

[182]

Stefanovic, D. and Martonosi, M. 2000. On availability of bit-narrow operations in general-purpose applications. In Proceedings of the 10th International Conference on Field-Programmable Logic and Applications (FPL'00). Lecture Notes in Computer Science, vol. 1896, Springer, Berlin, 412--421.

[183]

Stephenson, M., Babb, J., and Amarasinghe, S. 2000. Bitwidth analysis with application to silicon compilation. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'00). ACM, New York, 108--120.

[184]

Stretch Inc. http://www.stretchinc.com/.

[185]

Sutter, B. D., Mei, B., Bartic, A., Aa, T. V., Berekovic, M., Mignolet, J.-Y., Croes, K., Coene, P., Cupac, M., Couvreur, A., Folens, A., Dupont, S., Thielen, B. V., Kanstein, A., Kim, H.-S., and Kim, S. J. 2006. Hardware and a tool chain for ADRES. In Proceedings of the International Workshop on Applied Reconfigurable Computing (ARC'06). Lecture Notes in Computer Science, vol. 3985, Springer, Berlin, 425--430.

[186]

Synopsys Inc. 2000. CoCentric fixed-point designer. http://www.synopsys.com/.

[187]

Synplicity Inc. http://www.synplicity.com/.

[188]

Takayama, A., Shibata, Y., Iwai, K., and Amano, H. 2000. Dataflow partitioning and scheduling algorithms for WASMII, a virtual hardware. In Proceedings of the 10th International Workshop on Field-Programmable Logic and Applications (FPL'00). Lecture Notes in Computer Science, vol. 1896, Springer, Berlin, 685--694.

[189]

Taylor, M. B., Kim, J., Miller, J., Wentzlaff, D., Ghodrat, F., Greenwald, B., Hoffman, H., Johnson, P., Lee, J.-W., Lee, W., Ma, A., Saraf, A., Seneski, M., Shnidman, N., Strumpen, V., Frank, M., Amarasinghe, S., and Agarwal, A. 2002. The Raw microprocessor: A computational fabric for software circuits and general-purpose programs. IEEE Micro 22, 2, 25--35.

[190]

Tensilica Inc. http://www.tensilica.com/.

[191]

Tessier, R. and Burleson, W. 2001. Reconfigurable computing for digital signal processing: A survey. J. VLSI Signal Process. 28, 1--2, 7--27.

[192]

Todman, T., Constantinides, G., Wilton, S., Cheung, P., Luk, W., and Mencer, O. 2005. Reconfigurable computing: architectures and design methods. IEE Proc. (Comput. Digital Techniques) 152, 2, 193--207.

[193]

Trimberger, S. 1998. Scheduling designs into a time-multiplexed FPGA. In Proceedings of the ACM 6th International Symposium on Field-Programmable Gate Arrays (FPGA'98). ACM, New York, 153--160.

[194]

Tripp, J. L., Jackson, P. A., and Hutchings, B. 2002. Sea Cucumber: A synthesizing compiler for FPGAs. In Proceedings of the 12th International Conference on Field-Programmable Logic and Applications (FPL'02). Lecture Notes in Computer Science, vol. 2438, Springer, Berlin, 875--885.

[195]

Tripp, J. L., Peterson, K. D., Ahrens, C., Poznanovic, J. D., and Gokhale, M. 2005. Trident: An FPGA compiler framework for floating-point algorithms. In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL'05). IEEE, Los Alamitos, CA, 317--322.

[196]

Triscend Corp. 2000. Triscend A7 CSoC family.

[197]

Vahid, F. 1995. Procedure exlining: A transformation for improved system and behavioral synthesis. In Proceedings of the 8th International Symposium on System Synthesis (ISSS'95). ACM, New York, 84--89.

[198]

Vahid, F., Le, T. D., and Hsu, Y.-C. 1998. Functional partitioning improvements over structural partitioning for packaging constraints and synthesis: tool performance. ACM Trans. Des. Autom. Electron. Syst. 3, 2, 181--208.

[199]

Vasilko, M. and Ait-Boudaoud, D. 1996. Architectural synthesis techniques for dynamically reconfigurable logic. In Proceedings of the 6th International Workshop on Field-Programmable Logic and Applications (FPL'96). Lecture Notes in Computer Science, vol. 1142, Springer, Berlin, 290--296.

[200]

Waingold, E., Taylor, M., Srikrishna, D., Sarkar, V., Lee, W., Lee, V., Kim, J., Frank, M., Finch, P., Barua, R., Babb, J., Amarasinghe, S., and Agarwal, A. 1997. Baring it all to software: Raw machines. Computer 30, 9, 86--93.

[201]

Weinhardt, M. and Luk, W. 2001a. Memory access optimisation for reconfigurable systems. IEE Proc. (Comput. Digital Techniques) 148, 3, 105--112.

[202]

Weinhardt, M. and Luk, W. 2001b. Pipeline vectorization. IEEE Trans. Comput.-Aid. Des. Integrat. Circuits Syst. 20, 2, 234--248.

[203]

Willems, M., Bürsgens, V., Keding, H., Grötker, T., and Meyr, H. 1997. System level fixed-point design based on an interpolative approach. In Proceedings of the IEEE/ACM 34th Design Automation Conference (DAC'97). ACM, New York, 293--298.

[204]

Wilson, R., French, R., Wilson, C., Amarasinghe, S., Anderson, J., Tjiang, S., Liao, S., Tseng, C., Hall, M., Lam, M., and Hennessy, J. 1994. SUIF: An infrastructure for research on parallelizing and optimizing compilers. ACM SIGPLAN Not. 29, 12, 31--37.

[205]

Wirth, N. 1998. Hardware compilation: Translating programs into circuits. Computer 31, 6, 25--31.

[206]

Wirthlin, M. J. 1995. A dynamic instruction set computer. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'95). IEEE, Los Alamitos, CA, 99--107.

[207]

Wirthlin, M. J., Hutchings, B., and Worth, C. 2001. Synthesizing RTL hardware from Java byte codes. In Proceedings of the 11th International Conference on Field Programmable Logic and Applications (FPL'01). Lecture Notes in Computer Science, vol. 2147, Springer, Berlin, 123--132.

[208]

Wittig, R. and Chow, P. 1996. OneChip: An FPGA processor with reconfigurable logic. In Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'96). IEEE, Los Alamitos, CA, 126--135.

[209]

Wo, D. and Forward, K. 1994. Compiling to the gate level for a reconfigurable coprocessor. In Proceedings of the 2nd IEEE Workshop on FPGAs for Custom Computing Machines (FCCM'94). IEEE, Los Alamitos, CA, 147--154.

[210]

Wolfe, M. J. 1995. High Performance Compilers for Parallel Computing. Addison-Wesley, Reading, MA.

[211]

Xilinx Inc. http://www.xilinx.com/.

[212]

Xilinx Inc. 2001. Virtex-II 1.5V, field-programmable gate arrays (v1.7). http://www.xilinx.com.

[213]

XPP. XPP: The eXtreme processor platform, PACT home page. http://www.pactxpp.com, PACT XPP Technologies AG, Munich.

[214]

Ye, Z., Shenoy, N., and Banerjee, P. 2000a. A C compiler for a processor with a reconfigurable functional unit. In Proceedings of the ACM Symposium on Field Programmable Gate Arrays (FPGA'00). ACM, New York, 95--100.

[215]

Ye, Z. A., Moshovos, A., Hauck, S., and Banerjee, P. 2000b. Chimaera: A high-performance architecture with a tightly-coupled reconfigurable functional unit. In Proceedings of the 27th Annual International Symposium on Computer Architecture (ISCA'00). ACM, New York, 225--235.

[216]

Zhang, X. and Ng, K. W. 2000. A review of high-level synthesis for dynamically reconfigurable FPGAs. Microprocess. Microsys. 24, 4, 199--211.

[217]

Ziegler, H., Malusare, P., and Diniz, P. 2005. Array replication to increase parallelism in applications mapped to configurable architectures. In Proceedings of the 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC'05). Lecture Notes in Computer Science, vol. 4339, Springer, Berlin, 63--72.

[218]

Ziegler, H., So, B., Hall, M., and Diniz, P. 2002. Coarse-grain pipelining on multiple FPGA architectures. In Proceedings of the 10th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM'02). IEEE, Los Alamitos, CA, 77--86.

Information & Contributors

Information

Published In

ACM Computing Surveys Volume 42, Issue 4

June 2010

175 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/1749603

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 June 2010

Accepted: 01 December 2008

Revised: 01 June 2008

Received: 01 November 2004

Published in CSUR Volume 42, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

Fundação para a Ciência e a Tecnologia

Bibliometrics

Total citations: 79

Total downloads: 4,879

Downloads (last 12 months): 69

Downloads (last 6 weeks): 3

Reflects downloads up to 16 Feb 2025

Cited By

Sunny C, Das S, Martin K, Coussy P (2024) CREPE: Concurrent Reverse-Modulo-Scheduling and Placement for CGRAs. IEEE Transactions on Parallel and Distributed Systems 35:7 (1293-1306). Online publication date: 16-May-2024
https://dl.acm.org/doi/10.1109/TPDS.2024.3402098

Li Y, Zhu J, Fu Y, Lei Y, Nagata T, Braidwood R, Fu H, Zheng J, Luk W, Fan H (2024) Circular Reconfigurable Parallel Processor for Edge Computing: Industrial Product. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (863-875). Online publication date: 29-Jun-2024
https://doi.org/10.1109/ISCA59077.2024.00067

Chen N, Cheng F, Han C, Jiang J, Wen X (2023) Loop Subgraph-Level Greedy Mapping Algorithm for Grid Coarse-Grained Reconfigurable Array. Tsinghua Science and Technology 28:2 (330-343). Online publication date: Apr-2023
https://doi.org/10.26599/TST.2022.9010001

Böseler F, Walter J (2023) A Flexible Graph Language for a Model-Based Semi-Automatic CGRA Compilation Flow. 2023 Forum on Specification & Design Languages (FDL) (1-8). Online publication date: 13-Sep-2023
https://doi.org/10.1109/FDL59689.2023.10272184

Kaiser T, Gerfers F (2023) A 2.41-μW/MHz, 437-PE/mm² CGRA in 22 nm FD-SOI With RISC-Like Code Generation. 2023 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS) (1-6). Online publication date: 19-Apr-2023
https://doi.org/10.1109/COOLCHIPS57690.2023.10121985

Fryer J, Garcia P (2023) The Good, the Bad and the Ugly: Practices and Perspectives on Hardware Acceleration for Embedded Image Processing. Journal of Signal Processing Systems 95:10 (1181-1201). Online publication date: 1-Oct-2023
https://dl.acm.org/doi/10.1007/s11265-023-01885-5

Sylvestre L, Chailloux E, Sérot J (2023) Accelerating OCaml Programs on FPGA. International Journal of Parallel Programming 51:2-3 (186-207). Online publication date: 24-Jan-2023
https://dl.acm.org/doi/10.1007/s10766-022-00748-z

Khurge D (2023) Strategic Infrastructural Developments to Reinforce Reconfigurable Computing for Indigenous AI Applications. Artificial Intelligence Applications and Reconfigurable Architectures (1-24). Online publication date: 10-Feb-2023
https://doi.org/10.1002/9781119857891.ch1

Sozzo E, Conficconi D, Zeni A, Salaris M, Sciuto D, Santambrogio M (2022) Pushing the Level of Abstraction of Digital System Design: A Survey on How to Program FPGAs. ACM Computing Surveys 55:5 (1-48). Online publication date: 3-Dec-2022
https://dl.acm.org/doi/10.1145/3532989

Martin K (2022) Twenty Years of Automated Methods for Mapping Applications on CGRA. 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (679-686). Online publication date: May-2022
https://doi.org/10.1109/IPDPSW55747.2022.00118
