ABSTRACT
Retiming and slowdown are algorithms that can be used to pipeline synchronous circuits. Iterative modulo scheduling is an algorithm for software pipelining in the presence of resource constraints. Integrating the best features of both yields a pipelining algorithm, retimed modulo scheduling, that can more effectively exploit the idiosyncrasies of reconfigurable hardware. It also fits naturally into a design space exploration process that trades off speed against power, energy, or area.
- Performance-constrained pipelining of software loops onto reconfigurable hardware