Hierarchical Parallelism Control for Multigrain Parallel Processing

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2002)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 2481)

Abstract

To improve the effective performance and usability of shared-memory multiprocessor systems, a multigrain compilation scheme is important. Such a scheme hierarchically exploits coarse-grain parallelism among loops, subroutines, and basic blocks; conventional loop parallelism; and near-fine-grain parallelism among statements inside a basic block. To use the hierarchical parallelism of each nest level, or layer, efficiently in multigrain parallel processing, the number of processors, or groups of processors, assigned to each layer must be determined according to that layer's parallelism. This paper proposes an automatic hierarchical parallelism control scheme that assigns a suitable number of processors to each layer so that the parallelism of every layer can be used efficiently. The performance of the proposed scheme is evaluated on an IBM RS6000 SMP server with 8 processors, using 8 programs from SPEC95FP.
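As a rough illustration of what "assigning processors according to the parallelism of a layer" could look like, the following Python sketch estimates a layer's parallelism as total task cost divided by critical-path length and uses that estimate to split the available processors into groups. This is not the algorithm of the paper, whose details are not given in the abstract. The parallelism estimate, the divisor-based rounding, and the names `critical_path` and `split_processors` are all assumptions made for this example.

```python
from math import ceil


def critical_path(costs, deps):
    """Length of the longest cost-weighted path through a task graph.

    costs: {task: cost}; deps: {task: [predecessor tasks]} (must be acyclic).
    """
    memo = {}

    def finish(task):
        # Earliest finish time of `task` with unlimited processors.
        if task not in memo:
            preds = deps.get(task, [])
            memo[task] = costs[task] + max((finish(p) for p in preds), default=0)
        return memo[task]

    return max(finish(t) for t in costs)


def split_processors(costs, deps, available):
    """Return (groups, processors_per_group) for one layer.

    Assumed heuristic: estimate parallelism as total cost / critical path,
    then pick the largest divisor of `available` that does not exceed it.
    The processors inside each group remain free for the inner layers.
    """
    parallelism = ceil(sum(costs.values()) / critical_path(costs, deps))
    groups = max(d for d in range(1, available + 1)
                 if available % d == 0 and d <= parallelism)
    return groups, available // groups


# Hypothetical layer: three independent tasks followed by a join task.
layer_costs = {"A": 4, "B": 4, "C": 4, "D": 4}
layer_deps = {"D": ["A", "B", "C"]}
print(split_processors(layer_costs, layer_deps, available=8))
```

With 8 processors, a layer whose estimated parallelism is low keeps most processors inside each group, so they stay available for loop-level or statement-level parallelism in nested layers; a layer with many independent tasks gets more, smaller groups instead.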

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Obata, M., Shirako, J., Kaminaga, H., Ishizaka, K., Kasahara, H. (2005). Hierarchical Parallelism Control for Multigrain Parallel Processing. In: Pugh, B., Tseng, CW. (eds) Languages and Compilers for Parallel Computing. LCPC 2002. Lecture Notes in Computer Science, vol 2481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596110_3

  • DOI: https://doi.org/10.1007/11596110_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30781-5

  • Online ISBN: 978-3-540-31612-1

  • eBook Packages: Computer Science, Computer Science (R0)
