
A Framework for Global Communication Analysis and Optimizations

Chapter in Compiler Optimizations for Scalable Parallel Systems
Lecture Notes in Computer Science, vol. 1808

Abstract

Distributed memory architectures have become popular as a viable and cost-effective way of building scalable parallel computers. However, the absence of a global address space, with the consequent need for explicit message passing among processes, makes these machines very difficult to program. This has motivated the design of languages like High Performance Fortran (HPF) [14], which allow the programmer to write sequential or shared-memory parallel programs annotated with directives specifying the data decomposition. The compilers for these languages are responsible for partitioning the computation and for generating the communication necessary to fetch values of nonlocal data referenced by a processor. A number of such prototype compilers have been developed [3, 6, 19, 23, 29, 30, 33, 34, 43].
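To make the division of labor concrete, the following is a minimal HPF-style sketch, illustrative only and not an example taken from the chapter; the program name, the four-processor arrangement, the BLOCK distribution, and the stencil loop are all assumptions chosen for illustration. The programmer writes an ordinary Fortran loop and declares only the data layout; the compiler must partition the iterations (e.g., by the owner-computes rule) and generate the message passing for the nonlocal values of B at block boundaries.

      PROGRAM STENCIL
!     Illustrative HPF fragment: the data layout is declared,
!     the communication is not -- the compiler must synthesize it.
      REAL A(1000), B(1000)
      INTEGER I
!HPF$ PROCESSORS P(4)
!HPF$ DISTRIBUTE A(BLOCK) ONTO P
!HPF$ ALIGN B(I) WITH A(I)
      B = 1.0
!     Each processor owns a contiguous block of 250 elements of A and B.
!     Under the owner-computes rule, iteration I executes where A(I)
!     resides; at the edges of each block, B(I-1) or B(I+1) lives on a
!     neighboring processor, and the compiler inserts the sends and
!     receives needed to fetch those values before the loop body runs.
      DO I = 2, 999
         A(I) = (B(I-1) + B(I+1)) / 2.0
      END DO
      END PROGRAM STENCIL

Recognizing that such boundary exchanges can be hoisted, combined, and scheduled globally rather than generated reference by reference is precisely the kind of optimization a global communication analysis framework addresses.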


References

1. A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: principles, techniques, and tools. Addison-Wesley, 1986.

2. F. E. Allen and J. Cocke. A program data flow analysis procedure. Communications of the ACM, 19(3):137–147, March 1976.

3. S. P. Amarasinghe and M. S. Lam. Communication optimization and code generation for distributed memory machines. In Proc. ACM SIGPLAN '93 Conference on Programming Language Design and Implementation, Albuquerque, New Mexico, June 1993.

4. V. Balasundaram. A mechanism for keeping useful internal information in parallel programming tools: the data access descriptor. Journal of Parallel and Distributed Computing, 9(2):154–170, June 1990.

5. V. Balasundaram, G. Fox, K. Kennedy, and U. Kremer. A static performance estimator to guide data partitioning decisions. In Proc. Third ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, Williamsburg, VA, April 1991.

6. P. Banerjee, J. Chandy, M. Gupta, E. Hodges, J. Holm, A. Lain, D. Palermo, S. Ramaswamy, and E. Su. The PARADIGM compiler for distributed-memory multicomputers. IEEE Computer, October 1995.

7. M. Burke. An interval-based approach to exhaustive and incremental interprocedural data-flow analysis. ACM Transactions on Programming Languages and Systems, 12(3):341–395, July 1990.

8. D. Callahan and K. Kennedy. Analysis of interprocedural side effects in a parallel programming environment. In Proc. First International Conference on Supercomputing, Athens, Greece, 1987.

9. D. Callahan and K. Kennedy. Analysis of interprocedural side effects in a parallel programming environment. Journal of Parallel and Distributed Computing, 5:517–550, 1988.

10. S. Chakrabarti, M. Gupta, and J.-D. Choi. Global communication analysis and optimization. In Proc. ACM SIGPLAN Conference on Programming Language Design and Implementation, Philadelphia, PA, May 1996.

11. S. Chatterjee, J. R. Gilbert, R. Schreiber, and S.-H. Teng. Optimal evaluation of array expressions on massively parallel machines. In Proc. Second Workshop on Languages, Compilers, and Runtime Environments for Distributed Memory Multiprocessors, Boulder, CO, October 1992.

12. F. C. Chow. A portable machine-independent global optimizer — design and measurements. PhD thesis, Computer Systems Laboratory, Stanford University, December 1983.

13. D. M. Dhamdhere, B. K. Rosen, and F. K. Zadeck. How to analyze large programs efficiently and informatively. In Proc. ACM SIGPLAN '92 Conference on Programming Language Design and Implementation, San Francisco, CA, June 1992.

14. High Performance Fortran Forum. High Performance Fortran language specification, version 1.0. Technical Report CRPC-TR92225, Rice University, May 1993.

15. C. Gong, R. Gupta, and R. Melhem. Compilation techniques for optimizing communication in distributed-memory systems. In Proc. 1993 International Conference on Parallel Processing, St. Charles, IL, August 1993.

16. E. Granston and A. Veidenbaum. Detecting redundant accesses to array data. In Proc. Supercomputing '91, pages 854–865, 1991.

17. T. Gross and P. Steenkiste. Structured dataflow analysis for arrays and its use in an optimizing compiler. Software: Practice and Experience, 20(2):133–155, February 1990.

18. M. Gupta and P. Banerjee. A methodology for high-level synthesis of communication on multicomputers. In Proc. 6th ACM International Conference on Supercomputing, Washington D.C., July 1992.

19. M. Gupta, S. Midkiff, E. Schonberg, V. Seshadri, K. Y. Wang, D. Shields, W.-M. Ching, and T. Ngo. An HPF compiler for the IBM SP2. In Proc. Supercomputing '95, San Diego, CA, December 1995.

20. M. Gupta and E. Schonberg. Static analysis to reduce synchronization costs in data-parallel programs. In Proc. 23rd Annual ACM Symposium on Principles of Programming Languages, St. Petersburg Beach, FL, January 1996.

21. M. Gupta, E. Schonberg, and H. Srinivasan. A unified framework for optimizing communication in data-parallel programs. IEEE Transactions on Parallel and Distributed Systems, 7(7), July 1996.

22. P. Havlak and K. Kennedy. An implementation of interprocedural bounded regular section analysis. IEEE Transactions on Parallel and Distributed Systems, 2(3):350–360, July 1991.

23. S. Hiranandani, K. Kennedy, and C. Tseng. Compiling Fortran D for MIMD distributed-memory machines. Communications of the ACM, 35(8):66–80, August 1992.

24. J. Knoop, O. Rüthing, and B. Steffen. Lazy code motion. In Proc. ACM SIGPLAN '92 Conference on Programming Language Design and Implementation, San Francisco, CA, June 1992.

25. S. M. Joshi and D. M. Dhamdhere. A composite hoisting-strength reduction transformation for global program optimization (Parts I and II). International Journal of Computer Mathematics, pages 22–41, 111–126, 1992.

26. K. Kennedy and N. Nedeljkovic. Combining dependence and data-flow analyses to optimize communication. In Proc. 9th International Parallel Processing Symposium, Santa Barbara, CA, April 1995.

27. K. Kennedy and A. Sethi. Resource-based communication placement analysis. In Proc. Ninth Workshop on Languages and Compilers for Parallel Computing, San Jose, CA, August 1996.

28. C. Koelbel. Compiling programs for nonshared memory machines. PhD thesis, Purdue University, August 1990.

29. C. Koelbel and P. Mehrotra. Compiling global name-space parallel loops for distributed execution. IEEE Transactions on Parallel and Distributed Systems, 2(4):440–451, October 1991.

30. J. Li and M. Chen. Compiling communication-efficient programs for massively parallel machines. IEEE Transactions on Parallel and Distributed Systems, 2(3):361–376, July 1991.

31. E. Morel and C. Renvoise. Global optimization by suppression of partial redundancies. Communications of the ACM, 22(2):96–103, February 1979.

32. M. O'Boyle and F. Bodin. Compiler reduction of synchronization in shared virtual memory systems. In Proc. 9th ACM International Conference on Supercomputing, Barcelona, Spain, July 1995.

33. M. J. Quinn and P. J. Hatcher. Data-parallel programming on multicomputers. IEEE Software, 7:69–76, September 1990.

34. A. Rogers and K. Pingali. Process decomposition through locality of reference. In Proc. SIGPLAN '89 Conference on Programming Language Design and Implementation, pages 69–80, June 1989.

35. C. Rosene. Incremental Dependence Analysis. PhD thesis, Rice University, March 1990.

36. M. Snir et al. The communication software and parallel environment of the IBM SP2. IBM Systems Journal, 34(2):205–221, 1995.

37. C. Stunkel et al. The SP2 high performance switch. IBM Systems Journal, 34(2):185–204, 1995.

38. R. E. Tarjan. Testing flow graph reducibility. Journal of Computer and System Sciences, 9(3):355–365, December 1974.

39. C.-W. Tseng. Compiler optimizations for eliminating barrier synchronization. In Proc. 5th ACM Symposium on Principles and Practices of Parallel Programming, Santa Barbara, CA, July 1995.

40. R. v. Hanxleden and K. Kennedy. Give-n-take — a balanced code placement framework. In Proc. ACM SIGPLAN '94 Conference on Programming Language Design and Implementation, Orlando, Florida, June 1994.

41. R. v. Hanxleden, K. Kennedy, C. Koelbel, R. Das, and J. Saltz. Compiler analysis for irregular problems in Fortran D. In Proc. 5th Workshop on Languages and Compilers for Parallel Computing, New Haven, CT, August 1992.

42. M. Wolfe and U. Banerjee. Data dependence and its application to parallel processing. International Journal of Parallel Programming, 16(2):137–178, April 1987.

43. H. Zima, H. Bast, and M. Gerndt. SUPERB: A tool for semi-automatic MIMD/SIMD parallelization. Parallel Computing, 6:1–18, 1988.


Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

Cite this chapter

Gupta, M. (2001). A Framework for Global Communication Analysis and Optimizations. In: Pande, S., Agrawal, D.P. (eds) Compiler Optimizations for Scalable Parallel Systems. Lecture Notes in Computer Science, vol 1808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45403-9_14

  • DOI: https://doi.org/10.1007/3-540-45403-9_14
  • Publisher Name: Springer, Berlin, Heidelberg
  • Print ISBN: 978-3-540-41945-7
  • Online ISBN: 978-3-540-45403-8
