Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching

  • Conference paper
Applied Parallel Computing. State of the Art in Scientific Computing (PARA 2004)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3732)

Abstract

Cache optimizations use code transformations to increase the locality of memory accesses and use prefetching techniques to hide latency. For best performance, the hardware prefetching units of processors should be complemented with software prefetch instructions. A cache simulation enhanced with a model of a hardware prefetcher is presented and used to run the code of a 3D multigrid solver. In this way, cache misses that the hardware prefetcher does not predict can be identified and handled by inserting prefetch instructions. Additionally, Interleaved Block Prefetching (IBPF) is presented. Measurements show its potential.
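
The idea of simulating a cache together with a hardware prefetcher, in order to find the misses that remain, can be illustrated with a toy model. The sketch below is not the simulator described in the paper: it assumes a fully associative LRU cache, a 64-byte line size, and a simple next-line prefetcher that fetches the following line on every access. The names `PrefetchingCache` and `count_misses` are hypothetical and exist only for this illustration.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed value, not from the paper)


class PrefetchingCache:
    """Toy fully associative LRU cache with an optional next-line prefetcher."""

    def __init__(self, num_lines, prefetch=True):
        self.num_lines = num_lines
        self.prefetch = prefetch
        self.lines = OrderedDict()  # cache line number -> None, oldest first
        self.misses = 0

    def _insert(self, line):
        """Bring a line into the cache, evicting the least recently used one."""
        if line in self.lines:
            self.lines.move_to_end(line)
            return
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)
        self.lines[line] = None

    def access(self, address):
        """Simulate one load and return True on a hit."""
        line = address // LINE_SIZE
        hit = line in self.lines
        if not hit:
            self.misses += 1
        self._insert(line)
        if self.prefetch:
            # Assumed prefetch policy: always fetch the next sequential line.
            self._insert(line + 1)
        return hit


def count_misses(addresses, num_lines=512, prefetch=True):
    """Run an address trace through the toy cache and return the miss count."""
    cache = PrefetchingCache(num_lines, prefetch)
    for address in addresses:
        cache.access(address)
    return cache.misses


# Sequential sweep over 100 cache lines, one 8-byte element at a time.
sequential = range(0, LINE_SIZE * 100, 8)
# Sweep that touches only every second cache line (stride of 128 bytes).
strided = range(0, 2 * LINE_SIZE * 100, 2 * LINE_SIZE)

print(count_misses(sequential, prefetch=False))  # 100
print(count_misses(sequential, prefetch=True))   # 1
print(count_misses(strided, prefetch=True))      # 100
```

In this toy model the sequential sweep is fully covered by the prefetcher after the first miss, while the strided sweep misses on every access. Accesses of the second kind are the ones a simulation would flag as candidates for software prefetch instructions.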


Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Weidendorfer, J., Trinitis, C. (2006). Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching. In: Dongarra, J., Madsen, K., Waśniewski, J. (eds) Applied Parallel Computing. State of the Art in Scientific Computing. PARA 2004. Lecture Notes in Computer Science, vol 3732. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11558958_111

  • DOI: https://doi.org/10.1007/11558958_111

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29067-4

  • Online ISBN: 978-3-540-33498-9

  • eBook Packages: Computer Science, Computer Science (R0)
