A coarse-grained reconfigurable computing architecture with loop self-pipelining

Science in China Series F: Information Sciences

Abstract

Reconfigurable computing aims to balance the high efficiency of custom computing with the flexibility of general-purpose computing. This paper presents the implementation techniques used in LEAP, a coarse-grained reconfigurable array. It proposes a speculative execution mechanism for dynamic loop scheduling, with the goal of issuing one iteration per cycle, together with implementation techniques that support decoupling the synchronization between the token generator and the collector. The paper also introduces techniques for exploiting both intra- and inter-iteration data dependences, with the help of two instructions for special data reuse in loop-carried dependences. The experimental results show that LEAP performs, on average, 3% of the memory accesses of a RISC processor simulator with no memory optimization. In a practical image matching application, the LEAP architecture achieves a speedup of about 34 times in execution cycles compared with general-purpose processors.
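
The inter-iteration data reuse mentioned in the abstract can be illustrated in software with a minimal sketch. This is not LEAP's mechanism and does not model its two reuse instructions, which the abstract does not describe. It only shows, under assumed names (`CountingMemory`, `stencil_naive`, `stencil_reuse` are all hypothetical), how keeping values from earlier iterations in local storage reduces memory loads for a loop with loop-carried reuse.

```python
class CountingMemory:
    """Hypothetical memory model: wraps a list and counts every element load."""

    def __init__(self, data):
        self._data = list(data)
        self.loads = 0

    def load(self, i):
        self.loads += 1
        return self._data[i]

    def __len__(self):
        return len(self._data)


def stencil_naive(mem):
    """out[i] = a[i] + a[i+1] + a[i+2], reloading all three operands every iteration."""
    return [mem.load(i) + mem.load(i + 1) + mem.load(i + 2)
            for i in range(len(mem) - 2)]


def stencil_reuse(mem):
    """Same result, but operands loaded in earlier iterations are carried in locals."""
    n = len(mem)
    if n < 3:
        return []
    prev2, prev1 = mem.load(0), mem.load(1)  # prime the carried values once
    out = []
    for i in range(n - 2):
        cur = mem.load(i + 2)                # the only new operand this iteration
        out.append(prev2 + prev1 + cur)
        prev2, prev1 = prev1, cur            # rotate values into the next iteration
    return out
```

For a 10-element input, the naive loop issues 24 loads (3 per iteration over 8 iterations), while the reuse version issues 10 (2 to prime the carried values, then 1 per iteration). The ratio depends on the loop body and says nothing about the 3% figure reported for LEAP.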

Author information

Corresponding author

Correspondence to GuiMing Wu.

Additional information

Supported by the National Natural Science Foundation of China (Grant Nos. 60633050 and 60621003) and the National High Technology Research and Development Program of China (Grant No. 2007AA01Z06).

About this article

Cite this article

Dou, Y., Wu, G., Xu, J. et al. A coarse-grained reconfigurable computing architecture with loop self-pipelining. Sci. China Ser. F-Inf. Sci. 52, 575–587 (2009). https://doi.org/10.1007/s11432-008-0146-6

  • DOI: https://doi.org/10.1007/s11432-008-0146-6
