Optimizing throughput and resource utilization using pipelining: Transformation based approach

Potkonjak, Miodrag; Rabaey, Jan

doi:10.1007/BF02109380

Abstract

A simple formulation of pipelining, “Pipelining with N stages is equivalent to retiming where the number of delays on all inputs or all outputs, but not both, is increased by N,” is used as the basis for a convenient and efficient treatment of pipelining in the design of application-specific computers.
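The formulation quoted above can be illustrated with a minimal sketch. It is not the authors' implementation: the graph encoding (a dictionary from edges to delay counts), the function names, and the three-edge example chain are all assumptions made for illustration. The retiming step uses the standard rule that the new delay count on an edge u→v is the old count plus lag(v) minus lag(u).

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (source node, destination node); hypothetical encoding


def add_input_delays(delays: Dict[Edge, int], inputs: Set[str], n: int) -> Dict[Edge, int]:
    """Add n delays to every edge that leaves a primary input node."""
    return {(u, v): d + (n if u in inputs else 0) for (u, v), d in delays.items()}


def retime(delays: Dict[Edge, int], lag: Dict[str, int]) -> Dict[Edge, int]:
    """Standard retiming: new delay on u->v is old delay + lag[v] - lag[u]."""
    out = {(u, v): d + lag.get(v, 0) - lag.get(u, 0) for (u, v), d in delays.items()}
    if any(d < 0 for d in out.values()):
        raise ValueError("illegal retiming: an edge would get a negative delay count")
    return out


# Hypothetical chain: in -> a -> b -> out, with no delays at first.
graph = {("in", "a"): 0, ("a", "b"): 0, ("b", "out"): 0}
pipelined = add_input_delays(graph, inputs={"in"}, n=1)  # N = 1 on all inputs
balanced = retime(pipelined, {"a": -1})                  # move the delay past node a
print(balanced)
```

In this example the single added delay ends up on the edge between a and b, which splits the chain into two stages, while the total number of delays on the path from input to output stays equal to N.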

A classification of pipelining according to the objective function (throughput or resource utilization) and the latency is introduced. For two pipelining classes of polynomial complexity, optimal algorithms are presented. For two other classes, proofs of NP-completeness and efficient probabilistic algorithms are presented. Both theoretical and experimental properties of pipelining are discussed, and the relationship with other transformations is explored. Because software pipelining and the pipelining presented here have similar formulations, all results can be easily modified for use in compilers for general-purpose computers. We have also developed a polynomial complexity algorithm for determining the iteration bound.
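To make the iteration bound concrete, the sketch below computes it directly from its usual definition: the maximum, over all cycles of the data flow graph, of the total computation time of the nodes on the cycle divided by the number of delays on the cycle. This is a brute-force enumeration of simple cycles, so it is exponential in the worst case and is explicitly not the polynomial complexity algorithm the abstract refers to. The function name, the graph encoding, and the example values are assumptions made for illustration.

```python
from fractions import Fraction
from typing import Dict, List, Set, Tuple


def iteration_bound(times: Dict[str, int], edges: Dict[Tuple[str, str], int]) -> Fraction:
    """Brute force: max over simple cycles of (sum of node times) / (sum of delays).

    times maps every node to its computation time; edges maps (u, v) to a delay count.
    Returns 0 for an acyclic graph.
    """
    succ: Dict[str, List[str]] = {}
    for (u, v) in edges:
        succ.setdefault(u, []).append(v)
    order = {n: i for i, n in enumerate(sorted(times))}
    best = Fraction(0)

    def dfs(start: str, node: str, t: int, d: int, seen: Set[str]) -> None:
        nonlocal best
        for nxt in succ.get(node, []):
            nd = d + edges[(node, nxt)]
            if nxt == start:
                if nd == 0:
                    raise ValueError("delay-free cycle: graph is not computable")
                best = max(best, Fraction(t, nd))
            elif nxt not in seen and order[nxt] > order[start]:
                # Only visit nodes ordered after start, so each cycle is found once.
                dfs(start, nxt, t + times[nxt], nd, seen | {nxt})

    for s in times:
        dfs(s, s, times[s], 0, {s})
    return best


# Hypothetical graph with two cycles: a->b->a (time 3, 1 delay)
# and a->b->c->a (time 4, 2 delays).
times = {"a": 1, "b": 2, "c": 1}
edges = {("a", "b"): 0, ("b", "a"): 1, ("b", "c"): 0, ("c", "a"): 2}
print(iteration_bound(times, edges))
```

Exact rational arithmetic via `Fraction` avoids rounding when comparing cycle ratios. For larger graphs a polynomial method would be needed in place of this enumeration.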

Author information

Authors and Affiliations

Miodrag Potkonjak: C&C Research Laboratories, NEC, 4 Independence Way, Princeton, NJ 08540
Jan Rabaey: Dept. of EECS, University of California at Berkeley, Berkeley, CA 94720

Additional information

This work was done while the first author was at the University of California, Berkeley.

About this article

Cite this article

Potkonjak, M., Rabaey, J. Optimizing throughput and resource utilization using pipelining: Transformation based approach. Journal of VLSI Signal Processing 8, 117–130 (1994). https://doi.org/10.1007/BF02109380

Received: 28 December 1992
Revised: 11 August 1993
Published: 01 June 1994
Issue Date: June 1994
DOI: https://doi.org/10.1007/BF02109380
