Abstract
Shared virtual memory (SVM) simplifies the programming of parallel systems with memory hierarchies and physically distributed address spaces by providing the illusion of a flat global address space in which coherency is maintained at the page level. The success of the SVM abstraction depends on efficient page management, which in turn depends on the efficient handling of false sharing and of the ping-pong effects that it can cause. We evaluate two loop transformations for attacking this problem. The first is a simple, new technique for reducing the ping-pong effects that result from multiple-writer false sharing. The second is our previously proposed technique for eliminating multiple-writer false sharing itself. Both have been implemented in the Fortran-S compiler, which generates code that runs on the iPSC/2 under the KOAN SVM. Preliminary performance results are presented.
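To make the problem concrete, the following Python sketch models multiple-writer false sharing at page granularity. It is a toy model and not either of the transformations evaluated in the paper, which the abstract does not describe. It assumes a single-owner page protocol in which a write by a non-owner moves the page, processors that advance in lockstep, and a parallel loop in which iteration i writes element i of an array. The names `iterations_for` and `count_page_transfers`, the schedule labels, and the page size of 1024 elements are all hypothetical.

```python
from typing import Dict, List


def iterations_for(p: int, n: int, procs: int, schedule: str) -> List[int]:
    """Return the loop iterations assigned to processor p (hypothetical helper)."""
    if schedule == "cyclic":
        # Iterations are dealt out round-robin: p, p + procs, p + 2*procs, ...
        return list(range(p, n, procs))
    if schedule == "block":
        # Each processor receives one contiguous chunk of iterations.
        chunk = -(-n // procs)  # ceiling division
        return list(range(p * chunk, min(n, (p + 1) * chunk)))
    raise ValueError(f"unknown schedule: {schedule}")


def count_page_transfers(n: int, procs: int, schedule: str, page_size: int) -> int:
    """Count page ownership changes for a loop in which iteration i writes a[i].

    Assumed model: each page has exactly one owner, a write by any other
    processor transfers the page, and processors execute one write per step
    in lockstep. The first touch of a page is not counted as a transfer.
    """
    work = [iterations_for(p, n, procs, schedule) for p in range(procs)]
    owner: Dict[int, int] = {}
    transfers = 0
    for step in range(max(len(w) for w in work)):
        for p in range(procs):
            if step >= len(work[p]):
                continue
            page = work[p][step] // page_size
            previous = owner.get(page)
            if previous is not None and previous != p:
                transfers += 1  # the page ping-pongs to the new writer
            owner[page] = p
    return transfers


if __name__ == "__main__":
    for schedule in ("cyclic", "block"):
        count = count_page_transfers(n=4096, procs=4, schedule=schedule, page_size=1024)
        print(schedule, count)
```

In this model no two processors ever write the same array element, yet the cyclic schedule moves a page on almost every write because neighbouring elements share a page. The block schedule, with chunks aligned to page boundaries, causes no transfers at all. Real SVM systems differ in protocol details and timing, so the counts are illustrative only.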
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bodin, F., Granston, E.D., Montaut, T. (1995). Evaluating two loop transformations for reducing multiple-writer false sharing. In: Pingali, K., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1994. Lecture Notes in Computer Science, vol 892. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0025894
DOI: https://doi.org/10.1007/BFb0025894
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58868-9
Online ISBN: 978-3-540-49134-7
eBook Packages: Springer Book Archive