Achieving spilling-friendly register file assignment for highly distributed register files

Lu, Chia-Han; Shih, Wen-Li; Wu, Chung-Ju; Lee, Jenq Kuen

doi:10.1007/s11227-014-1181-2

Published: 22 August 2014

Volume 69, pages 1342–1362, (2014)

The Journal of Supercomputing

Abstract

Distributed register file architectures divide registers into multiple sets, so each individual register file can be small. When register allocation encounters high register pressure, small register files increase the frequency of spilling, and the extra spill code reduces performance. One factor that can produce high pressure is improper register file assignment, the phase that assigns virtual registers to suitable register files while avoiding communication costs. To reduce spilling during register file assignment, this paper proposes the SPIlling-FRiendly (SPIFR) method, which attempts to reduce spilling by estimating the spilling cost from two aspects: assignment and spilling. We evaluated the method on the MiBench and EEMBC benchmarks using an Open64-based compiler and a cycle-accurate instruction set simulator. On MiBench, the SPIFR method improved the average cycle counts of the benchmarks by 6.0%; for the benchmark kernels, it improved the average cycle counts by 20.5% and reduced the average spilling ratio by 19.0%. On EEMBC, the method achieved an average speedup of 7.0% in cycle counts, an average speedup of 11.3% for the kernel functions, and an average reduction of 11.7% in the spilling ratio. We conclude that the SPIFR method can reduce spilling and increase performance.
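
The trade-off described in the abstract, communication cost versus spilling caused by pressure on a small register file, can be illustrated with a toy sketch. This is not the SPIFR algorithm, whose details are not given in the abstract. It is a minimal greedy heuristic under loudly stated assumptions: all virtual registers are live at the same time, each has one preferred register file, and the names `VReg`, `assign_register_files`, `rf0`, `rf1`, and the cost values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VReg:
    """Hypothetical virtual register used only for this illustration."""
    name: str
    preferred_file: str  # file that needs no inter-file copy for this register
    spill_cost: float    # estimated cost paid if this register must be spilled


def assign_register_files(vregs, capacity, copy_cost):
    """Greedily place each virtual register in the cheapest register file.

    Assumption: every register is live simultaneously, so a file is "full"
    once it holds `capacity[file]` registers. The cost of a placement is the
    copy cost (if the file is not the preferred one) plus the register's
    spill cost (if the file is already full).
    """
    load = {f: 0 for f in capacity}
    assignment = {}
    # Place the most expensive-to-spill registers first.
    for v in sorted(vregs, key=lambda r: r.spill_cost, reverse=True):
        def cost(f, v=v):
            comm = 0.0 if f == v.preferred_file else copy_cost
            spill = v.spill_cost if load[f] >= capacity[f] else 0.0
            return comm + spill

        best = min(capacity, key=cost)
        assignment[v.name] = best
        load[best] += 1
    return assignment


def overflow(assignment, capacity):
    """Count registers placed beyond each file's capacity (likely spills)."""
    load = {f: 0 for f in capacity}
    for f in assignment.values():
        load[f] += 1
    return sum(max(0, load[f] - capacity[f]) for f in capacity)


capacity = {"rf0": 2, "rf1": 2}
vregs = [
    VReg("v0", "rf0", 10.0),
    VReg("v1", "rf0", 8.0),
    VReg("v2", "rf0", 6.0),
    VReg("v3", "rf0", 4.0),
]

# Assignment that only follows the preferred file, ignoring pressure.
naive = {v.name: v.preferred_file for v in vregs}
# Assignment that also accounts for the estimated spill cost.
aware = assign_register_files(vregs, capacity, copy_cost=1.0)

print(overflow(naive, capacity), overflow(aware, capacity))
```

In this example the pressure-unaware assignment overfills `rf0` by two registers, while the pressure-aware one pays a small copy cost to move `v2` and `v3` into `rf1` and overfills nothing.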

Author information

Authors and Affiliations

Department of Computer Science, National Tsing Hua University, Hsinchu, 30013, Taiwan
Chia-Han Lu, Wen-Li Shih, Chung-Ju Wu & Jenq Kuen Lee

Corresponding author

Correspondence to Jenq Kuen Lee.

Additional information

Contract/grant sponsor: NSC; contract/grant number: 102-2220-E-007-001 and 102-2219-E-007-001. Contract/grant sponsor: Ministry of Economic Affairs; contract/grant number: 102-EC-17-A-02-S1-202.

About this article

Cite this article

Lu, CH., Shih, WL., Wu, CJ. et al. Achieving spilling-friendly register file assignment for highly distributed register files. J Supercomput 69, 1342–1362 (2014). https://doi.org/10.1007/s11227-014-1181-2

Published: 22 August 2014
Issue Date: September 2014
DOI: https://doi.org/10.1007/s11227-014-1181-2
