Integer loop code generation for VLIW

Radigan, James; Chang, Pohua; Banerjee, Utpal

doi:10.1007/BFb0014208

Conference paper
First Online: 01 January 2005

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1033)

Abstract

Code generation for complex integer loops on VLIW architectures has, to date, been handled by several disparate methodologies. We provide an empirical study that characterizes a typical complex integer loop and propose a general solution that optimally modifies the key control dependencies in common integer loops. This single algorithm integrates several software techniques (assuming key architectural features) to handle varying degrees of nested complex control flow. A number of techniques, including loop peeling, loop unrolling, software pipelining, if-conversion, and procedure inlining, are combined cohesively to make the best transformation decisions for a typical integer loop before scheduling and register allocation. Optimal fusion and distribution decisions are assumed.
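
As a rough illustration of two of the transformations named in the abstract, the sketch below applies if-conversion and loop unrolling by hand to a small integer loop. It is not the paper's algorithm: the function names, the clamping example, and the unroll factor of 2 are all assumptions chosen for illustration, and Python is used only to show the source-level shape of the transformed loops.

```python
def sum_clamped_branchy(values, limit):
    """Original form (hypothetical example): a data-dependent branch in the body."""
    total = 0
    for v in values:
        if v > limit:
            total += limit
        else:
            total += v
    return total


def sum_clamped_if_converted(values, limit):
    """If-converted form: the branch becomes a predicate that selects a value."""
    total = 0
    for v in values:
        p = int(v > limit)                  # predicate computed as 0 or 1
        total += p * limit + (1 - p) * v    # both arms evaluated, one selected
    return total


def sum_clamped_unrolled(values, limit):
    """If-converted body unrolled by 2 (assumed factor), plus a remainder loop."""
    total = 0
    n = len(values)
    i = 0
    while i + 1 < n:
        a, b = values[i], values[i + 1]
        pa, pb = int(a > limit), int(b > limit)
        # The two copies of the body do not depend on each other's predicate,
        # so a wide-issue machine could in principle schedule them together.
        total += pa * limit + (1 - pa) * a
        total += pb * limit + (1 - pb) * b
        i += 2
    while i < n:                            # remainder iterations for odd lengths
        v = values[i]
        p = int(v > limit)
        total += p * limit + (1 - p) * v
        i += 1
    return total
```

All three functions compute the same sum. The if-converted body has no internal control flow, which is the property that makes a loop body easier to unroll or software-pipeline on a machine with predicated execution.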

Author information

Authors and Affiliations

Intel Architecture Lab, M/S RN6-18, 2200 Mission College Blvd., Santa Clara, CA 95052-8119
James Radigan, Pohua Chang & Utpal Banerjee

Editor information

Chua-Huang Huang, Ponnuswamy Sadayappan, Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

Cite this paper

Radigan, J., Chang, P., Banerjee, U. (1996). Integer loop code generation for VLIW. In: Huang, CH., Sadayappan, P., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1995. Lecture Notes in Computer Science, vol 1033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014208

DOI: https://doi.org/10.1007/BFb0014208
Published: 09 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60765-6
Online ISBN: 978-3-540-49446-1
eBook Packages: Springer Book Archive
