Compilation and Runtime Optimizations for Software Distributed Shared Memory

  • Conference paper
  • In: Languages, Compilers, and Run-Time Systems for Scalable Computers (LCR 2000)

Abstract

We present two novel optimizations for compiling High Performance Fortran (HPF) to page-based software distributed shared memory (SDSM) systems. One technique, compiler-managed restricted consistency, uses compiler-derived knowledge to delay the application of memory consistency operations to data that is provably not shared in the current synchronization interval, thus reducing false sharing¹. The other technique, compiler-managed shared buffers, when combined with the previous optimization, eliminates fragmentation². Together, the two techniques permit compiler-generated code to apply multidimensional computation partitioning and wavefront parallelism efficiently on SDSM systems.

¹ False sharing occurs when two or more processors each access mutually disjoint sets of data elements in the same block.

² Fragmentation occurs when an entire block of data is communicated to transport only a small fraction of its content.
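
The two definitions above can be made concrete with a small sketch. It is illustrative only and is not taken from the paper: the page size, the element size, and the helper names `pages_touched`, `falsely_shared_pages`, and `fragmentation` are all assumptions chosen for the example. It treats a "block" as one fixed-size page and checks which pages are accessed by several processors whose element sets do not overlap, then computes what fraction of a transferred page is not needed by the receiver.

```python
# Illustrative sketch only; sizes and helper names are assumptions, not from the paper.
PAGE_SIZE = 4096      # bytes per SDSM page (assumed)
ELEM_SIZE = 8         # bytes per array element, e.g. a double-precision real (assumed)
ELEMS_PER_PAGE = PAGE_SIZE // ELEM_SIZE


def pages_touched(indices):
    """Return the set of page numbers covering the given element indices."""
    return {i // ELEMS_PER_PAGE for i in indices}


def falsely_shared_pages(accesses):
    """Pages accessed by two or more processors whose element sets are mutually disjoint.

    `accesses` maps a processor id to the set of element indices it accesses.
    """
    result = set()
    all_pages = set().union(*(pages_touched(s) for s in accesses.values()))
    for page in all_pages:
        # Elements of this page accessed by each processor.
        per_proc = [
            {i for i in s if i // ELEMS_PER_PAGE == page}
            for s in accesses.values()
        ]
        per_proc = [s for s in per_proc if s]
        if len(per_proc) < 2:
            continue
        total = sum(len(s) for s in per_proc)
        union = set().union(*per_proc)
        if total == len(union):  # no element is accessed by more than one processor
            result.add(page)
    return result


def fragmentation(useful_elems, pages_sent):
    """Fraction of communicated bytes that the receiver does not need."""
    sent_bytes = pages_sent * PAGE_SIZE
    return 1.0 - (useful_elems * ELEM_SIZE) / sent_bytes
```

Under these assumed sizes, two processors that access elements 0..255 and 256..511 respectively touch the same page without touching a common element, so `falsely_shared_pages` reports page 0. Likewise, shipping one whole page to deliver 16 elements gives `fragmentation(16, 1)`, meaning most of the transferred bytes are unused.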

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, K., Mellor-Crummey, J., Fowler, R.J. (2000). Compilation and Runtime Optimizations for Software Distributed Shared Memory. In: Dwarkadas, S. (eds) Languages, Compilers, and Run-Time Systems for Scalable Computers. LCR 2000. Lecture Notes in Computer Science, vol 1915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40889-4_14

  • DOI: https://doi.org/10.1007/3-540-40889-4_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41185-7

  • Online ISBN: 978-3-540-40889-5

  • eBook Packages: Springer Book Archive
