Loop transformations to prevent false sharing

International Journal of Parallel Programming

Abstract

To date, page management in shared virtual memory (SVM) systems has been primarily the responsibility of the run-time system. However, there are some problems that are difficult to resolve efficiently at run time. Chief among these is false sharing. In this paper, a loop transformation theory is developed for identifying and eliminating potential sources of multiple-writer false sharing and other sources of page migration resulting from regular references in numerical applications. Loop nests of one and two dimensions (before blocking) with single-level, DOALL-style parallelism are covered. The potential of these transformations is demonstrated experimentally.
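As an illustration of the multiple-writer false sharing described above, the following minimal sketch counts how many pages end up with more than one writing processor when a one-dimensional DOALL loop is scheduled in two different ways. It is not the transformation theory developed in the paper. The page size, array length, processor count, and the function names `cyclic` and `page_aligned` are all assumptions chosen for the example, and each iteration `i` is assumed to write only element `a[i]`.

```python
from collections import defaultdict

# Hypothetical parameters, chosen only for illustration.
N_ITERS = 4096         # iterations of the DOALL loop; iteration i writes a[i]
ELEMS_PER_PAGE = 512   # array elements that fit on one page
N_PROCS = 4            # processors sharing the loop


def pages_with_multiple_writers(owner_of, n_iters, elems_per_page):
    """Count pages that are written by more than one processor.

    owner_of(i) returns the processor that executes iteration i.
    """
    writers = defaultdict(set)
    for i in range(n_iters):
        page = i // elems_per_page
        writers[page].add(owner_of(i))
    return sum(1 for procs in writers.values() if len(procs) > 1)


def cyclic(i):
    """Iteration i goes to processor i mod P: neighbours on a page differ."""
    return i % N_PROCS


def page_aligned(i):
    """All iterations that write the same page go to the same processor."""
    return (i // ELEMS_PER_PAGE) % N_PROCS


cyclic_shared = pages_with_multiple_writers(cyclic, N_ITERS, ELEMS_PER_PAGE)
aligned_shared = pages_with_multiple_writers(page_aligned, N_ITERS, ELEMS_PER_PAGE)
print("pages with multiple writers, cyclic schedule:", cyclic_shared)
print("pages with multiple writers, page-aligned schedule:", aligned_shared)
```

Under these assumptions, the cyclic schedule makes every page a multiple-writer page, while the page-aligned schedule leaves none. No two processors ever write the same element in either case, which is what makes the sharing false.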




Additional information

Supported by a Postdoctoral Research Associateship in Computational Science and Engineering under National Science Foundation Grant No. CDA-9310307, and by the Center for Research on Parallel Computation under Grant No. CCR-9120008.

Supported by the Esprit Agency XIII under Grant No. APPARC 6634 BRA III and Intel SSD under Grant No. 1 92 C 250 00 31318 01 2.


About this article

Cite this article

Granston, E.D., Montaut, T. & Bodin, F. Loop transformations to prevent false sharing. Int J Parallel Prog 23, 263–301 (1995). https://doi.org/10.1007/BF02577768


  • DOI: https://doi.org/10.1007/BF02577768
