Automatic Pre-Fetch and Modulo Scheduling Transformations for the Cell BE Architecture

Vujić, Nikola; Gonzàlez, Marc; Martorell, Xavier; Ayguadé, Eduard

doi:10.1007/978-3-540-89740-8_3

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5335)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

Programming difficulty is one of the main impediments to the broad acceptance of multi-core systems that have no hardware support for transparent data transfer between local and global memories. A software cache is a robust way to give the user a transparent view of the memory architecture, but it can suffer from poor performance. In this paper, we propose a hierarchical, hybrid software-cache architecture designed to enable pre-fetch techniques. Memory accesses are classified at compile time into two classes: high-locality and irregular. Our approach then steers each memory reference toward one of two cache structures, each optimized for its access pattern. These structures allow high-level compiler optimizations to aggressively unroll loops, reorder cache references, and/or transform surrounding loops so as to practically eliminate the software cache overhead in the innermost loop. The cache design enables automatic pre-fetch and modulo scheduling transformations. Performance evaluation indicates that the optimized software-cache structures, combined with the proposed pre-fetch techniques, translate into speed-ups of between 10% and 20%. The evaluation uses a set of parallel NAS applications.
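The abstract does not give implementation details, so the following is only a generic illustration of the idea of overlapping a pre-fetch for the next cache line with computation on the current one, which is the effect a modulo-scheduled loop is meant to achieve. It is a minimal Python simulation, not the authors' cache design: `SimulatedDMA`, `sum_with_prefetch`, `LINE_SIZE`, and the two-tag double-buffering scheme are all hypothetical names and choices made for this sketch.

```python
# Illustrative simulation only. All names and sizes here are assumptions
# made for this sketch; none of them come from the paper.
LINE_SIZE = 4  # elements per simulated cache line (assumed)


class SimulatedDMA:
    """Stands in for an asynchronous transfer from global to local memory."""

    def __init__(self, global_memory):
        self.global_memory = global_memory
        self.pending = {}  # tag -> line contents currently "in flight"
        self.issued = 0    # number of transfers started so far

    def get(self, tag, line_index):
        # Start a transfer. A real DMA engine would return immediately
        # and complete the copy in the background.
        start = line_index * LINE_SIZE
        self.pending[tag] = self.global_memory[start:start + LINE_SIZE]
        self.issued += 1

    def wait(self, tag):
        # Block until the transfer identified by `tag` has completed.
        return self.pending.pop(tag)


def sum_with_prefetch(global_memory):
    """Sum an array line by line, fetching line k+1 while working on line k."""
    dma = SimulatedDMA(global_memory)
    num_lines = (len(global_memory) + LINE_SIZE - 1) // LINE_SIZE
    if num_lines == 0:
        return 0, dma.issued

    total = 0
    dma.get(tag=0, line_index=0)  # prologue: fetch the first line
    for k in range(num_lines):
        if k + 1 < num_lines:
            # Fetch stage of iteration k+1, issued before the compute
            # stage of iteration k. Two tags alternate (double buffering).
            dma.get(tag=(k + 1) % 2, line_index=k + 1)
        line = dma.wait(tag=k % 2)  # compute stage of iteration k
        total += sum(line)
    return total, dma.issued
```

Calling `sum_with_prefetch(list(range(10)))` walks three simulated lines and issues one transfer per line. In this simulation the transfers complete instantly, so the sketch shows only the loop structure (prologue, overlapped steady state, drained final iteration), not any performance effect.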

Author information

Authors and Affiliations

Barcelona Supercomputing Center, Department of Computer Architecture, Technical University of Catalonia, Spain
Nikola Vujić, Marc Gonzàlez, Xavier Martorell & Eduard Ayguadé

Editor information

Editors and Affiliations

Department of Computing Science, University of Alberta, T6G-2E8, Edmonton, AB, Canada
José Nelson Amaral

About this paper

Cite this paper

Vujić, N., Gonzàlez, M., Martorell, X., Ayguadé, E. (2008). Automatic Pre-Fetch and Modulo Scheduling Transformations for the Cell BE Architecture. In: Amaral, J.N. (eds) Languages and Compilers for Parallel Computing. LCPC 2008. Lecture Notes in Computer Science, vol 5335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89740-8_3

DOI: https://doi.org/10.1007/978-3-540-89740-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89739-2
Online ISBN: 978-3-540-89740-8
eBook Packages: Computer Science, Computer Science (R0)
