Abstract
This paper describes automatic coarse-grain parallel processing on a shared-memory multiprocessor system using a newly developed OpenMP backend of the OSCAR multigrain parallelizing compiler, which targets platforms ranging from a single-chip multiprocessor to a high-performance multiprocessor and a heterogeneous supercomputer cluster. The OSCAR multigrain parallelizing compiler exploits coarse-grain task parallelism and near-fine-grain parallelism in addition to traditional loop parallelism. Starting from an ordinary Fortran source program, the OpenMP backend generates parallelized Fortran code with OpenMP directives based on the multigrain parallelism analyzed by the middle path of the OSCAR compiler. The performance of multigrain parallel processing with the OpenMP backend is evaluated on an off-the-shelf eight-processor SMP machine, the IBM RS6000. The evaluation shows that multigrain parallel processing achieves a speedup of more than 2 times over a commercial loop-parallelizing compiler, the IBM XL Fortran compiler, on this SMP machine.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ishizaka, K., Obata, M., Kasahara, H. (2000). Coarse-grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler. In: Valero, M., Joe, K., Kitsuregawa, M., Tanaka, H. (eds) High Performance Computing. ISHPC 2000. Lecture Notes in Computer Science, vol 1940. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39999-2_43
DOI: https://doi.org/10.1007/3-540-39999-2_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41128-4
Online ISBN: 978-3-540-39999-5
eBook Packages: Springer Book Archive