
Improving Compiler and Run-Time Support for Irregular Reductions Using Local Writes

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1656)

Abstract

Current compilers for distributed-memory multiprocessors parallelize irregular reductions either by generating calls to sophisticated run-time systems (CHAOS) or by relying on replicated buffers and the shared-memory interface supported by software DSMs (TreadMarks). We introduce Local Write, a new technique for parallelizing irregular reductions based on the owner-computes rule. It eliminates the need for buffers or synchronized writes, but may replicate computation. We investigate the impact of connectivity (node/edge ratio), locality (accesses to local data), and adaptivity (edge modifications) on the relative performance of the three approaches. Local Write improves performance by 50–150% compared to using replicated buffers, and can match or exceed gather/scatter for applications with low locality or high adaptivity.
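The idea described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the paper's implementation: the block partition in `owner`, the edge-list representation, the `weight` callback, and all function names are hypothetical and chosen for illustration. Each simulated processor visits every edge that touches a node it owns, recomputes the edge's contribution, and writes only to the endpoints it owns, so no buffers or synchronized writes are needed, at the cost of evaluating cross-partition edges more than once.

```python
def sequential_reduction(num_nodes, edges, weight):
    """Reference irregular reduction: every edge updates both endpoints."""
    acc = [0.0] * num_nodes
    for a, b in edges:
        f = weight(a, b)
        acc[a] += f
        acc[b] -= f
    return acc


def owner(node, num_nodes, num_procs):
    """Map a node to its owning processor (block partition, an assumption)."""
    block = (num_nodes + num_procs - 1) // num_procs
    return node // block


def local_write_reduction(num_nodes, edges, weight, num_procs):
    """Owner-computes sketch: each processor writes only to nodes it owns.

    Processors are simulated one after another; because their write sets
    are disjoint, the order in which they run does not affect the result.
    Returns the accumulated values and the number of weight evaluations.
    """
    acc = [0.0] * num_nodes
    evaluations = 0
    for p in range(num_procs):
        for a, b in edges:
            owns_a = owner(a, num_nodes, num_procs) == p
            owns_b = owner(b, num_nodes, num_procs) == p
            if not (owns_a or owns_b):
                continue  # this edge touches no data local to processor p
            f = weight(a, b)  # recomputed by each owner of a crossing edge
            evaluations += 1
            if owns_a:
                acc[a] += f
            if owns_b:
                acc[b] -= f
    return acc, evaluations
```

For example, with 4 nodes split across 2 processors and the edges `(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)`, three edges cross the partition, so the sketch performs 8 weight evaluations instead of 5 while producing the same result as the sequential loop. That extra work is the replicated computation the abstract refers to, and it grows as locality drops.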

This research was supported by NSF CAREER Development Award #ASC9625531 in New Technologies. The IBM SP-2 and DEC Alpha Cluster were provided by NSF CISE Institutional Infrastructure Award #CDA9401151 and grants from IBM and DEC.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Han, H., Tseng, C.-W. (1999). Improving Compiler and Run-Time Support for Irregular Reductions Using Local Writes. In: Chatterjee, S., et al. Languages and Compilers for Parallel Computing. LCPC 1998. Lecture Notes in Computer Science, vol 1656. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48319-5_12

  • DOI: https://doi.org/10.1007/3-540-48319-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66426-0

  • Online ISBN: 978-3-540-48319-9

  • eBook Packages: Springer Book Archive
