
Coarse-grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler

  • Conference paper
High Performance Computing (ISHPC 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1940)


Abstract

This paper describes automatic coarse-grain parallel processing on a shared-memory multiprocessor system using a newly developed OpenMP backend of the OSCAR multigrain parallelizing compiler, which targets platforms ranging from single-chip multiprocessors to high-performance multiprocessors and heterogeneous supercomputer clusters. The OSCAR compiler exploits coarse-grain task parallelism and near-fine-grain parallelism in addition to traditional loop parallelism. From an ordinary Fortran source program, the OpenMP backend generates parallelized Fortran code with OpenMP directives, based on the multigrain parallelism analyzed by the middle path of the OSCAR compiler. The performance of multigrain parallel processing through the OpenMP backend is evaluated on an off-the-shelf eight-processor SMP machine, an IBM RS6000. The evaluation shows that multigrain parallel processing achieves more than a two-fold speedup over a commercial loop-parallelizing compiler, the IBM XL Fortran compiler, on the same SMP machine.




Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ishizaka, K., Obata, M., Kasahara, H. (2000). Coarse-grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler. In: Valero, M., Joe, K., Kitsuregawa, M., Tanaka, H. (eds) High Performance Computing. ISHPC 2000. Lecture Notes in Computer Science, vol 1940. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39999-2_43

Download citation

  • DOI: https://doi.org/10.1007/3-540-39999-2_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41128-4

  • Online ISBN: 978-3-540-39999-5

  • eBook Packages: Springer Book Archive
