Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP

Nakano, Hirofumi; Ishizaka, Kazuhisa; Obata, Motoki; Kimura, Keiji; Kasahara, Hironori

doi:10.1023/A:1023038702472

Published: June 2003

International Journal of Parallel Programming, Volume 31, pages 211–223 (2003)

Abstract

Effective use of cache memory is becoming more important as the gap between processor speed and memory access speed widens. Exploiting multigrain parallelism is likewise becoming more important for improving effective performance beyond the limits of loop-iteration-level parallelism. With these factors in mind, this paper proposes a static scheduling scheme for coarse grain tasks that takes cache optimization into account. After decomposing tasks and data at compile time according to the cache size, the proposed scheme assigns coarse grain tasks to threads so that data shared among coarse grain tasks can be passed through the cache. The scheme is implemented in the OSCAR Fortran multigrain parallelizing compiler and evaluated on a Sun Ultra80 four-processor SMP workstation using Swim and Tomcatv from SPECfp95. On 4 processors, the proposed scheme achieves a speedup of 4.56 times for Swim and 2.37 times for Tomcatv over the Sun Forte HPC Ver. 6 update 1 loop parallelizing compiler.
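
The scheduling idea in the abstract can be illustrated with a minimal Python sketch. It is not the OSCAR compiler's algorithm, which this page does not describe in detail. It assumes two hypothetical loops, `loop_a` and `loop_b`, that sweep the same iteration range and share data only within the same iteration, so a chunk of one loop can be followed immediately by the same chunk of the other. The function names, the 4096-byte working set per iteration, and the 1 MiB cache size are all illustrative assumptions, not figures from the paper.

```python
def decompose(iterations, bytes_per_iteration, cache_bytes, n_threads):
    """Split [0, iterations) into chunks whose working set fits in cache.

    The chunk count is rounded up to a multiple of n_threads so that
    every thread receives the same number of chunks.
    """
    if iterations <= 0:
        return []
    # Largest number of iterations whose data fits in the cache.
    max_iters = max(1, cache_bytes // bytes_per_iteration)
    n_chunks = -(-iterations // max_iters)            # ceiling division
    n_chunks = -(-n_chunks // n_threads) * n_threads  # balance across threads
    base, extra = divmod(iterations, n_chunks)
    chunks, start = [], 0
    for k in range(n_chunks):
        size = base + (1 if k < extra else 0)
        if size:  # skip empty chunks when iterations < n_chunks
            chunks.append((start, start + size))
        start += size
    return chunks


def schedule(loop_names, chunks, n_threads):
    """Statically assign (loop, chunk) tasks to threads.

    All loops touching the same chunk are placed back to back on one
    thread, so the chunk's data can stay in that processor's cache.
    """
    plan = [[] for _ in range(n_threads)]
    for k, chunk in enumerate(chunks):
        thread = k % n_threads  # fixed assignment decided before execution
        for name in loop_names:
            plan[thread].append((name, chunk))
    return plan


# Hypothetical sizes: 1000 iterations, 4096 bytes each, 1 MiB cache, 4 threads.
chunks = decompose(1000, 4096, 1 << 20, 4)
plan = schedule(["loop_a", "loop_b"], chunks, 4)
for thread_id, tasks in enumerate(plan):
    print(thread_id, tasks)
```

In an OpenMP setting, each entry of `plan` would correspond to the sequence of coarse grain tasks executed by one thread inside a single parallel region. Loops with dependences that cross chunk boundaries would need additional synchronization or overlapping data, which this sketch does not model.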

Author information

Authors and Affiliations

Waseda University, 3-4-1 Ohkubo, Shinjuku-ku, Tokyo, 169-8555, Japan
Hirofumi Nakano, Kazuhisa Ishizaka, Motoki Obata, Keiji Kimura & Hironori Kasahara

Cite this article

Nakano, H., Ishizaka, K., Obata, M. et al. Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP. International Journal of Parallel Programming 31, 211–223 (2003). https://doi.org/10.1023/A:1023038702472
