Impact of Reverse Computing on Information Locality in Register Allocation for High Performance Computing

Bahi, Mouad; Eisenbeis, Christine

doi:10.1007/s10766-012-0212-y

Published: 09 August 2012

Volume 42, pages 49–76 (2014)
International Journal of Parallel Programming

Mouad Bahi^1,2,3 &
Christine Eisenbeis^1,2

Abstract

Reversible computing aims to keep all information about input and intermediate values available at every step of a computation, making information virtually present everywhere. Rematerialization in register allocation amounts to recomputing values instead of spilling them to memory when registers run out. In this paper we detail a heuristic algorithm that exploits reverse computing for register rematerialization. This improves information locality, since it provides more opportunities for retrieving data. Rematerialization adds instructions, and we show on one specifically designed example that reverse computing may alleviate the impact of these additional instructions on performance. We also show how thread parallelism may be optimized on GPUs by performing register allocation with reverse recomputing, which increases the number of threads per Streaming Multiprocessor. This is done on the main kernel of a Lattice Quantum Chromodynamics simulation program, where we gain an 11% speedup.
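
The contrast between spilling a value and recovering it by reverse computing can be illustrated with a toy sketch. This is not the authors' heuristic: the function names, the dictionary standing in for a spill slot, and the choice of integer addition as the reversible operation are all assumptions made for illustration. It only shows that when c = a + b and both c and b are still in registers, a can be recomputed as c - b instead of being stored and reloaded.

```python
# Toy illustration only: names and structure are hypothetical, not from the paper.
# Integer addition is used because it is exactly invertible; floating-point
# addition generally is not, so this sketch does not carry over to floats as is.

def forward_add(a, b):
    """Forward computation: c = a + b."""
    return a + b

def reverse_add(c, b):
    """Reverse computation: recover a from c and b, since a = c - b."""
    return c - b

def with_spill(a, b):
    """Keep a alive across the addition by spilling it to memory."""
    memory = {}              # stands in for a stack spill slot
    memory["a"] = a          # spill: store a before its register is reused
    c = forward_add(a, b)
    a = None                 # the register that held a is now reused
    a = memory["a"]          # reload: one memory access
    return a * c

def with_reverse_remat(a, b):
    """Recover a by reversing the addition, with no memory traffic."""
    c = forward_add(a, b)
    a = None                 # the register that held a is now reused
    a = reverse_add(c, b)    # rematerialize a from values still in registers
    return a * c

print(with_spill(3, 4), with_reverse_remat(3, 4))
```

Both functions return the same result; the second trades a store and a load for one extra arithmetic instruction, which is the kind of trade-off the abstract describes.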

Author information

Authors and Affiliations

INRIA Saclay Île-de-France, Orsay, France
Mouad Bahi & Christine Eisenbeis
LRI, Université de Paris-Sud 11, Orsay, France
Mouad Bahi & Christine Eisenbeis
LERMA, Observatoire de Paris, Paris, France
Mouad Bahi

Authors

Mouad Bahi
Christine Eisenbeis

Corresponding author

Correspondence to Mouad Bahi.

About this article

Cite this article

Bahi, M., Eisenbeis, C. Impact of Reverse Computing on Information Locality in Register Allocation for High Performance Computing. Int J Parallel Prog 42, 49–76 (2014). https://doi.org/10.1007/s10766-012-0212-y

Received: 30 November 2011
Accepted: 25 July 2012
Published: 09 August 2012
Issue Date: February 2014
DOI: https://doi.org/10.1007/s10766-012-0212-y
