Abstract
We present two novel optimizations for compiling High Performance Fortran (HPF) to page-based software distributed shared memory (SDSM) systems. One technique, compiler-managed restricted consistency, uses compiler-derived knowledge to delay the application of memory consistency operations to data that is provably not shared in the current synchronization interval, thus reducing false sharing¹. The other technique, compiler-managed shared buffers, when combined with the previous optimization, eliminates fragmentation². Together, the two techniques permit compiler-generated code that applies multi-dimensional computation partitioning and wavefront parallelism to execute efficiently on SDSM systems.
¹ False sharing occurs when two or more processors each access mutually disjoint sets of data elements in the same block.
² Fragmentation occurs when an entire block of data is communicated to transport only a small fraction of its content.
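The two definitions above can be made concrete with a small sketch. It is illustrative only and is not taken from the paper: the page size, the element size, and the helper names `pages_touched`, `falsely_shared_pages`, and `useful_fraction` are all assumptions chosen for the example. It treats one SDSM page as the "block" and reports which pages are touched by more than one processor even though their element ranges are disjoint, plus the fraction of a page that is useful when only a few elements are needed.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # bytes per page (assumed value, not from the paper)
ELEM_SIZE = 8     # bytes per array element (assumed double precision)


def pages_touched(first_elem, last_elem):
    """Return the page numbers covered by elements first_elem..last_elem (inclusive)."""
    first_page = (first_elem * ELEM_SIZE) // PAGE_SIZE
    last_page = (last_elem * ELEM_SIZE + ELEM_SIZE - 1) // PAGE_SIZE
    return set(range(first_page, last_page + 1))


def falsely_shared_pages(partitions):
    """Map each page touched by more than one processor to those processors.

    partitions maps a processor id to its (first_elem, last_elem) range.
    The ranges are assumed to be mutually disjoint (this is not checked),
    so any page with several owners is falsely shared in the sense of
    the first footnote.
    """
    owners = defaultdict(set)
    for proc, (lo, hi) in partitions.items():
        for page in pages_touched(lo, hi):
            owners[page].add(proc)
    return {page: procs for page, procs in owners.items() if len(procs) > 1}


def useful_fraction(elems_needed):
    """Fraction of a page that is actually needed when the whole page is sent."""
    return (elems_needed * ELEM_SIZE) / PAGE_SIZE


# 1000 elements split evenly over 4 processors; a page holds 512 elements,
# so partition boundaries fall inside pages.
block_partition = {0: (0, 249), 1: (250, 499), 2: (500, 749), 3: (750, 999)}
shared = falsely_shared_pages(block_partition)

# Partitions aligned to page boundaries do not share any page.
aligned_partition = {0: (0, 511), 1: (512, 1023)}
aligned_shared = falsely_shared_pages(aligned_partition)

# Sending a whole page to deliver 4 boundary elements uses under 1% of it.
boundary_fraction = useful_fraction(4)
```

Under these assumptions, the unaligned block partition makes both pages falsely shared, while the page-aligned partition shares none. Transferring a full page for four elements illustrates fragmentation in the sense of the second footnote.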
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, K., Mellor-Crummey, J., Fowler, R.J. (2000). Compilation and Runtime Optimizations for Software Distributed Shared Memory. In: Dwarkadas, S. (eds) Languages, Compilers, and Run-Time Systems for Scalable Computers. LCR 2000. Lecture Notes in Computer Science, vol 1915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40889-4_14
DOI: https://doi.org/10.1007/3-540-40889-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41185-7
Online ISBN: 978-3-540-40889-5
eBook Packages: Springer Book Archive