DOI: 10.1145/1254766.1254783
Article

Tetris: a new register pressure control technique for VLIW processors

Published: 13 June 2007

Abstract

The run-time performance of VLIW (very long instruction word) microprocessors depends heavily on the effectiveness of their associated optimizing compilers. Typical VLIW compiler phases include instruction scheduling, which maximizes instruction level parallelism (ILP), and register allocation, which minimizes data spills to external memory. If ILP is maximized without considering register constraints, high register pressure may result, leading to increased spill code and reduced run-time performance. This paper presents a new register pressure reduction technique for embedded VLIW processors that controls register pressure prior to instruction scheduling and register allocation. By modifying the relative ordering of operations, the technique restructures code to reduce spills. The technique has been implemented in Trimaran, an academic VLIW compiler, and evaluated using a series of VLIW benchmarks. Experimental results show that, on average, the algorithm reduces dynamic spills and improves overall cycle counts by 6% for a VLIW architecture with 8 functional units and 32 registers, compared with previous spill code reduction techniques.
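To make the link between operation ordering and register pressure concrete, the following minimal sketch counts how many values are live at once for two orderings of the same small dependence graph. It is an illustration only and not the Tetris algorithm described in the paper. The function `max_live`, the operation names, and the simplified liveness model (a value occupies a register from its definition until its last reader has executed, and unread results stay live to the end) are all assumptions made for this example.

```python
def max_live(order, uses):
    """Return the peak number of simultaneously live values.

    `order` is a linear sequence of operation names (hypothetical).
    `uses[op]` lists the operations whose results `op` reads.
    """
    position = {op: i for i, op in enumerate(order)}

    # Position of the last reader of each value.
    last_use = {}
    for op, operands in uses.items():
        for src in operands:
            last_use[src] = max(last_use.get(src, -1), position[op])

    # Assumption: a value nobody reads is live-out, so it stays live to the end.
    for op in order:
        last_use.setdefault(op, len(order))

    # After step i, a value is live if it is already defined and still has a
    # pending reader.
    return max(
        sum(1 for v in order if position[v] <= i < last_use[v])
        for i in range(len(order))
    )


# r = (l1 + l2) + (l3 + l4), where l1..l4 are loads.
uses = {"a1": ["l1", "l2"], "a2": ["l3", "l4"], "r": ["a1", "a2"]}

# Issue every independent load first, as an ILP-greedy order might.
parallel_first = ["l1", "l2", "l3", "l4", "a1", "a2", "r"]
# Consume each pair of loads before issuing the next pair.
interleaved = ["l1", "l2", "a1", "l3", "l4", "a2", "r"]

print(max_live(parallel_first, uses))  # 4
print(max_live(interleaved, uses))     # 3
```

Both orderings respect the same dependences, yet the second needs one fewer register at its peak. On a machine with few registers, that kind of difference can decide whether spill code is inserted.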

    Published In

    LCTES '07: Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
    June 2007
    258 pages
    ISBN:9781595936325
    DOI:10.1145/1254766
    • ACM SIGPLAN Notices, Volume 42, Issue 7
      Proceedings of the 2007 LCTES conference
      July 2007
      241 pages
      ISSN:0362-1340
      EISSN:1558-1160
      DOI:10.1145/1273444
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. instruction level parallelism
    2. register pressure control
    3. very long instruction word (VLIW) processor

    Qualifiers

    • Article

    Acceptance Rates

    Overall Acceptance Rate 116 of 438 submissions, 26%

    Cited By

    • (2024) How fast can we play Tetris greedily with rectangular pieces? Theoretical Computer Science 992:C. DOI: 10.1016/j.tcs.2024.114405. Online publication date: 21-Apr-2024
    • (2021) F1: A Fast and Programmable Accelerator for Fully Homomorphic Encryption. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (238-252). DOI: 10.1145/3466752.3480070. Online publication date: 18-Oct-2021
    • (2013) Optimal and heuristic global code motion for minimal spilling. Proceedings of the 22nd international conference on Compiler Construction (21-40). DOI: 10.1007/978-3-642-37051-9_2. Online publication date: 16-Mar-2013
    • (2010) Register allocation with instruction scheduling for VLIW-architectures. Programming and Computing Software 36:6 (363-367). DOI: 10.1134/S0361768810060058. Online publication date: 1-Nov-2010
    • (2009) Tetris-XL. ACM Transactions on Architecture and Code Optimization 6:3 (1-40). DOI: 10.1145/1582710.1582713. Online publication date: 2-Oct-2009
    • (2011) Register pressure aware scheduling for high level synthesis. Proceedings of the 16th Asia and South Pacific Design Automation Conference (461-466). DOI: 10.5555/1950815.1950911. Online publication date: 25-Jan-2011
    • (2011) Register pressure aware scheduling for high level synthesis. 16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011) (461-466). DOI: 10.1109/ASPDAC.2011.5722234. Online publication date: Jan-2011
