research-article

SARA: StreAm register allocation

Authors: Praveen Raghavan, Francky Catthoor

CODES+ISSS '09: Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis

Pages 41 - 50

https://doi.org/10.1145/1629435.1629442

Published: 11 October 2009

Abstract

Low-power design criteria for embedded systems have led to many innovative architectures. One of the core architectural changes of the recent past is the introduction of streaming registers. These architectures have been shown to be both power-efficient and performance-efficient. However, code has to be mapped onto them efficiently to make maximal use of their potential. This paper introduces a novel technique for compiling C code onto streaming registers. The proposed technique exploits not only the temporal locality in arrays but also their spatial locality to map code onto streaming registers. The proposed Stream Register Allocation (SARA) technique is shown to provide good mapping efficiency and to scale to realistic applications.

Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. Compilers

Published In

CODES+ISSS '09: Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis

October 2009

498 pages

ISBN:9781605586281

DOI:10.1145/1629435

Program Chairs: Wolfgang Rosenstiel (University of Tübingen, Germany), Kazutoshi Wakabayashi (NEC, Japan)

Copyright © 2009 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ESWeek '09: Fifth Embedded Systems Week

October 11 - 16, 2009

Grenoble, France

Acceptance Rates

Overall Acceptance Rate 280 of 864 submissions, 32%
