Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching

Weidendorfer, Josef; Trinitis, Carsten

doi:10.1007/11558958_111

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3732)

Included in the following conference series:

International Workshop on Applied Parallel Computing

Abstract

Cache optimizations use code transformations to increase the locality of memory accesses, and prefetching techniques to hide latency. For best performance, the hardware prefetching units of processors should be complemented with software prefetch instructions. A cache simulation enhanced with a model of a hardware prefetcher is presented and used to run code for a 3D multigrid solver. Cache misses that the hardware prefetcher does not predict can thus be identified and handled by inserting prefetch instructions. Additionally, Interleaved Block Prefetching (IBPF) is presented; measurements show its potential.
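
The idea of simulating a cache together with a hardware prefetcher, and then looking at the misses that remain, can be illustrated with a toy model. The sketch below is not the simulator described in the paper. It assumes a small fully associative LRU cache, a 64-byte line size, and a simple next-line prefetcher that triggers only when an access moves to the directly following line. The names `TinyCache`, `count_misses`, and `LINE_SIZE` are hypothetical and chosen for illustration only.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed value)


class TinyCache:
    """Toy fully associative LRU cache with an optional next-line prefetcher."""

    def __init__(self, num_lines=8, prefetch=False):
        self.num_lines = num_lines
        self.prefetch = prefetch
        self.lines = OrderedDict()  # cache line number -> None, oldest first
        self.misses = 0
        self.last_line = None       # line touched by the previous access

    def _install(self, line):
        # Insert the line (or refresh it) as most recently used.
        self.lines[line] = None
        self.lines.move_to_end(line)
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used line

    def access(self, addr):
        line = addr // LINE_SIZE
        if line not in self.lines:
            self.misses += 1
        self._install(line)
        # Assumed prefetcher model: an ascending step to the adjacent line
        # is treated as a stream, so the following line is loaded early.
        if (self.prefetch and self.last_line is not None
                and line == self.last_line + 1):
            self._install(line + 1)
        self.last_line = line


def count_misses(addresses, num_lines=8, prefetch=False):
    """Replay a list of byte addresses and return the number of misses."""
    cache = TinyCache(num_lines, prefetch)
    for addr in addresses:
        cache.access(addr)
    return cache.misses


if __name__ == "__main__":
    # Sequential sweep over 16 lines in 8-byte steps (e.g. doubles).
    sequential = [i * 8 for i in range(16 * LINE_SIZE // 8)]
    # Sweep that jumps four lines per access, as a large-stride walk would.
    strided = [i * 4 * LINE_SIZE for i in range(16)]
    for name, trace in (("sequential", sequential), ("strided", strided)):
        print(name,
              count_misses(trace, prefetch=False),
              count_misses(trace, prefetch=True))
```

Under these assumptions, the sequential sweep drops from 16 misses to 2 once the prefetcher is enabled, while the strided sweep still misses on all 16 accesses. Misses of the second kind, which the modelled hardware prefetcher does not cover, are the ones for which inserting software prefetch instructions would be considered.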

Author information

Authors and Affiliations

Technische Universität München, Germany
Josef Weidendorfer & Carsten Trinitis

Editor information

Editors and Affiliations

Computer Science Department, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra
Department of Informatics and Mathematical Modelling, Technical University of Denmark, DK-2800, Lyngby, Denmark
Kaj Madsen
Informatics & Mathematical Modeling, Technical University of Denmark, DK-2800, Lyngby, Denmark
Jerzy Waśniewski

About this paper

Cite this paper

Weidendorfer, J., Trinitis, C. (2006). Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching. In: Dongarra, J., Madsen, K., Waśniewski, J. (eds) Applied Parallel Computing. State of the Art in Scientific Computing. PARA 2004. Lecture Notes in Computer Science, vol 3732. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11558958_111

DOI: https://doi.org/10.1007/11558958_111
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29067-4
Online ISBN: 978-3-540-33498-9
eBook Packages: Computer Science (R0)
