
Tetris: a new register pressure control technique for VLIW processors

Authors:

Russell Tessier

LCTES '07: Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems

Pages 113 - 122

https://doi.org/10.1145/1254766.1254783

Published: 13 June 2007

Abstract

The run-time performance of VLIW (very long instruction word) microprocessors depends heavily on the effectiveness of their associated optimizing compilers. Typical VLIW compiler phases include instruction scheduling, which maximizes instruction-level parallelism (ILP), and register allocation, which minimizes data spills to external memory. If ILP is maximized without considering register constraints, high register pressure may result, leading to increased spill code and reduced run-time performance. In this paper, a new register pressure reduction technique for embedded VLIW processors is presented to control register pressure prior to instruction scheduling and register allocation. By modifying the relative ordering of operations, this technique restructures code to better reduce spills. Our technique has been implemented in Trimaran, an academic VLIW compiler, and evaluated using a series of VLIW benchmarks. Experimental results show that, on average, our algorithm reduces dynamic spills and improves overall cycle counts by 6% for a VLIW architecture with 8 functional units and 32 registers versus previous spill code reduction techniques.
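
The premise that operation ordering alone changes register pressure can be illustrated with a small sketch. This is not the Tetris algorithm, which the abstract does not describe in detail; it is only a toy model written for illustration. The operation names, the `peak_register_pressure` helper, and the simplifying assumption that a destination value needs its own register while its sources are still live are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical basic block: operation name -> (value defined, values read).
Op = Tuple[str, List[str]]

OPS: Dict[str, Op] = {
    "ld_a": ("a", []),
    "ld_b": ("b", []),
    "ld_c": ("c", []),
    "ld_d": ("d", []),
    "mul1": ("t1", ["a", "b"]),
    "mul2": ("t2", ["c", "d"]),
    "add":  ("r", ["t1", "t2"]),
}


def peak_register_pressure(order: List[str], ops: Dict[str, Op],
                           live_out: Set[str]) -> int:
    """Maximum number of simultaneously live values for a sequential order.

    Assumption: the destination becomes live before the sources of the
    same operation are released, so they cannot share a register.
    """
    # Position of the last read of each value in this order.
    last_use: Dict[str, int] = {}
    for pos, name in enumerate(order):
        for src in ops[name][1]:
            last_use[src] = pos

    live: Set[str] = set()
    peak = 0
    for pos, name in enumerate(order):
        dest, srcs = ops[name]
        live.add(dest)
        peak = max(peak, len(live))
        # Release sources read for the last time here, unless live-out.
        for src in srcs:
            if last_use[src] == pos and src not in live_out:
                live.discard(src)
    return peak


# Same dependence graph, two legal orderings.
loads_first = ["ld_a", "ld_b", "ld_c", "ld_d", "mul1", "mul2", "add"]
interleaved = ["ld_a", "ld_b", "mul1", "ld_c", "ld_d", "mul2", "add"]

pressure_loads_first = peak_register_pressure(loads_first, OPS, {"r"})
pressure_interleaved = peak_register_pressure(interleaved, OPS, {"r"})
print(pressure_loads_first, pressure_interleaved)  # 5 4
```

Under this model, hoisting all loads to the top exposes more parallelism but keeps five values live at once, while consuming `a` and `b` before loading `c` and `d` lowers the peak to four. On a machine with few registers, that difference can decide whether spill code is needed.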


Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. Compilers



Published In


LCTES '07: Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems

June 2007

258 pages

ISBN: 9781595936325

DOI: 10.1145/1254766

General Chair: Santosh Pande, Georgia Institute of Technology, USA

Program Chair: Zhiyuan Li, Purdue University, USA

ACM SIGPLAN Notices, Volume 42, Issue 7
Proceedings of the 2007 LCTES conference
July 2007
241 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1273444

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States


Conference

LCTES '07: ACM SIGBED-SIGPLAN Conference on Languages, Compilers and Tools for Embedded Systems

June 13 - 15, 2007

San Diego, California, USA


Cited By

Dallant J, Iacono J (2024). How fast can we play Tetris greedily with rectangular pieces? Theoretical Computer Science 992:C. Online publication date: 21-Apr-2024.
https://dl.acm.org/doi/10.1016/j.tcs.2024.114405
Samardzic N, Feldmann A, Krastev A, Devadas S, Dreslinski R, Peikert C, Sanchez D (2021). F1: A Fast and Programmable Accelerator for Fully Homomorphic Encryption. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pages 238-252. Online publication date: 18-Oct-2021.
https://dl.acm.org/doi/10.1145/3466752.3480070
Barany G, Krall A (2013). Optimal and heuristic global code motion for minimal spilling. Proceedings of the 22nd international conference on Compiler Construction, pages 21-40. Online publication date: 16-Mar-2013.
https://dl.acm.org/doi/10.1007/978-3-642-37051-9_2
Ivanov D (2010). Register allocation with instruction scheduling for VLIW-architectures. Programming and Computing Software 36:6, pages 363-367. Online publication date: 1-Nov-2010.
https://dl.acm.org/doi/10.1134/S0361768810060058
Xu W, Tessier R (2009). Tetris-XL. ACM Transactions on Architecture and Code Optimization 6:3, pages 1-40. Online publication date: 2-Oct-2009.
https://dl.acm.org/doi/10.1145/1582710.1582713
Beidas R, Mong W, Zhu J (2011). Register pressure aware scheduling for high level synthesis. Proceedings of the 16th Asia and South Pacific Design Automation Conference, pages 461-466. Online publication date: 25-Jan-2011.
https://dl.acm.org/doi/10.5555/1950815.1950911
Beidas R, Mong W, Zhu J (2011). Register pressure aware scheduling for high level synthesis. 16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011), pages 461-466. Online publication date: Jan-2011.
https://doi.org/10.1109/ASPDAC.2011.5722234
