A method for runtime recognition of collective communication on distributed-memory multiprocessors

  • VII Poster Session Papers
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1336)

Abstract

In this paper, we present a compiler optimization for recognizing patterns of collective communication at runtime in data-parallel languages that allow dynamic data decomposition. It runs in O(m) time and is well suited to large numerical applications and massively parallel machines. The previous approach took O(n_0 + ... + n_{m-1}) time, where m is the number of dimensions of an array and n_i is the array size in the i-th dimension. The new method can be used for data redistribution and intrinsic procedures, as well as for data prefetch in parallelized loops.
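To make the O(m) claim concrete, the sketch below classifies the collective communication implied by a redistribution using one comparison per array dimension, independent of the array extents n_i. This is an illustrative reconstruction, not the paper's actual algorithm: the descriptor fields (kind, nprocs) and the pattern names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DimDist:
    """Distribution of one array dimension over the processor grid (hypothetical descriptor)."""
    kind: str     # "BLOCK", "CYCLIC", or "COLLAPSED" (whole dimension on one processor)
    nprocs: int   # processor-grid extent along this dimension

def classify_redistribution(src: list[DimDist], dst: list[DimDist]) -> str:
    """Classify the communication needed to redistribute src -> dst.

    One descriptor comparison per dimension: O(m) for an m-dimensional array.
    """
    assert len(src) == len(dst), "arrays must have the same rank"
    changed = [(s, d) for s, d in zip(src, dst) if s != d]
    if not changed:
        return "NONE"        # identical layouts: no data movement
    if all(s.kind == "COLLAPSED" for s, _ in changed):
        return "SCATTER"     # a single owner fans data out along every changed dimension
    if all(d.kind == "COLLAPSED" for _, d in changed):
        return "GATHER"      # data is fanned in to a single owner
    return "ALL_TO_ALL"      # general layout change: any processor may exchange with any other

if __name__ == "__main__":
    # (BLOCK, BLOCK) -> (CYCLIC, BLOCK): dimension 0 changes layout, so a
    # general exchange is required regardless of how large the array is.
    src = [DimDist("BLOCK", 4), DimDist("BLOCK", 2)]
    dst = [DimDist("CYCLIC", 4), DimDist("BLOCK", 2)]
    print(classify_redistribution(src, dst))  # -> ALL_TO_ALL
```

The key point is that the classification touches each of the m dimension descriptors exactly once, so its cost does not grow with the array sizes n_0, ..., n_{m-1}.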

Editor information

Constantine Polychronopoulos, Kazuki Joe, Keijiro Araki, Makoto Amamiya

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ogasawara, T., Komatsu, H. (1997). A method for runtime recognition of collective communication on distributed-memory multiprocessors. In: Polychronopoulos, C., Joe, K., Araki, K., Amamiya, M. (eds) High Performance Computing. ISHPC 1997. Lecture Notes in Computer Science, vol 1336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0024231

  • DOI: https://doi.org/10.1007/BFb0024231

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63766-0

  • Online ISBN: 978-3-540-69644-5
