Coarse-grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler

Ishizaka, Kazuhisa; Obata, Motoki; Kasahara, Hironori

doi:10.1007/3-540-39999-2_43

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1940)

Included in the following conference series:

International Symposium on High Performance Computing

Abstract

This paper describes automatic coarse-grain parallel processing on a shared-memory multiprocessor system using a newly developed OpenMP backend of the OSCAR multigrain parallelizing compiler, which targets systems ranging from a single-chip multiprocessor to a high-performance multiprocessor and a heterogeneous supercomputer cluster. The OSCAR multigrain parallelizing compiler exploits coarse-grain task parallelism and near-fine-grain parallelism in addition to traditional loop parallelism. Starting from an ordinary Fortran source program, the OpenMP backend generates parallelized Fortran code with OpenMP directives, based on the multigrain parallelism analyzed by the middle path of the OSCAR compiler. The performance of multigrain parallel processing with the OpenMP backend is evaluated on an off-the-shelf eight-processor SMP machine, the IBM RS6000. The evaluation shows that multigrain parallel processing achieves a speedup of more than 2 times over a commercial loop-parallelizing compiler, the IBM XL Fortran compiler, on the same SMP machine.

Author information

Authors and Affiliations

Waseda University, 3-4-1 Ohkubo, Shinjuku-ku, Tokyo 169-8555, Japan
Kazuhisa Ishizaka, Motoki Obata & Hironori Kasahara

Editor information

Editors and Affiliations

Departamento de Arquitectura de Computadores, Universidad Politecnica de Catalunya, Spain
Mateo Valero
Department of Information and Computer Sciences, Nara Women’s University, Japan
Kazuki Joe
Institute of Industrial Science Center for Conceptual Information Processing Research, University of Tokyo, Japan
Masaru Kitsuregawa
Graduate School of Engineering Electrical Engineering Department, University of Tokyo, Japan
Hidehiko Tanaka

About this paper

Cite this paper

Ishizaka, K., Obata, M., Kasahara, H. (2000). Coarse-grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler. In: Valero, M., Joe, K., Kitsuregawa, M., Tanaka, H. (eds) High Performance Computing. ISHPC 2000. Lecture Notes in Computer Science, vol 1940. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39999-2_43

DOI: https://doi.org/10.1007/3-540-39999-2_43
Published: 06 April 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41128-4
Online ISBN: 978-3-540-39999-5
eBook Packages: Springer Book Archive
