A Fission Technique Enabling Parallelization of Imperfectly Nested Loops

Ju, Jialin; Chaudhary, Vipin

doi:10.1007/978-3-540-46642-0_13


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1745)

Included in the following conference series:

International Conference on High-Performance Computing


Abstract

This paper addresses the problem of parallelizing imperfectly nested loops. Current parallelizing compilers and loop transformations either parallelize only the innermost loop (which is closer to vectorization than to parallelization) or do not parallelize the loops at all. We present an approach that transforms an imperfectly nested loop into at most three fully parallel, perfectly nested loops. The transformed loops can then be parallelized by any parallelizing compiler. The advantages of our technique are the simplicity of the transformed loops and the low synchronization overhead. We tested the feasibility of this approach on several types of loops, including loops from the Eispack math library and the Linpack benchmark, on different multiprocessor platforms, and compared its performance with Sun’s MP C and Cray’s autotasking. The results show that our method is very effective.
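To make the terminology concrete, the following is a minimal sketch of textbook loop fission (loop distribution) applied to a small imperfectly nested loop. It is not the authors' algorithm, which the abstract does not describe in detail, and it splits the loop into two nests rather than the up to three mentioned above. The function names `imperfect`, `fissioned`, and `verify_fission`, the arrays `a`, `b`, `s`, and the sizes `N` and `M` are all hypothetical choices made for illustration.

```c
#include <assert.h>

#define N 4   /* hypothetical outer trip count */
#define M 3   /* hypothetical inner trip count */

/* Imperfectly nested: the assignment to s[i] sits between the two
 * loop headers, so the nest is not a pure i/j loop body. */
void imperfect(double a[N][M], double s[N], const double b[N])
{
    for (int i = 0; i < N; i++) {
        s[i] = 2.0 * b[i];
        for (int j = 0; j < M; j++)
            a[i][j] = a[i][j] + s[i];
    }
}

/* After fission: two perfectly nested loops. Every s[i] is written in
 * the first loop before any of its reads in the second, and neither
 * loop carries a dependence between its own iterations. */
void fissioned(double a[N][M], double s[N], const double b[N])
{
    for (int i = 0; i < N; i++)
        s[i] = 2.0 * b[i];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            a[i][j] = a[i][j] + s[i];
}

/* Runs both versions on identical inputs and checks that they agree.
 * Returns 1 when every element matches. */
int verify_fission(void)
{
    double b[N], s1[N], s2[N], a1[N][M], a2[N][M];

    for (int i = 0; i < N; i++) {
        b[i] = (double)(i + 1);
        for (int j = 0; j < M; j++)
            a1[i][j] = a2[i][j] = (double)(i * M + j);
    }

    imperfect(a1, s1, b);
    fissioned(a2, s2, b);

    for (int i = 0; i < N; i++) {
        assert(s1[i] == s2[i]);
        for (int j = 0; j < M; j++)
            assert(a1[i][j] == a2[i][j]);
    }
    return 1;
}
```

Splitting the loop is legal in this sketch because the only dependence runs forward, from the write of `s[i]` to its reads in the inner loop. Loops with more complicated dependences need the kind of analysis the paper itself develops.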



Author information

Authors and Affiliations

Pacific Northwest National Laboratory, Richland, WA, 99352, USA
Jialin Ju
Parallel and Distributed Computing Laboratory, Wayne State University, Detroit, MI, 48202, USA
Vipin Chaudhary


Editor information

Editors and Affiliations

College of Engineering, University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL 60607-7043, USA
Prith Banerjee
Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089-2562, USA
Viktor K. Prasanna
Indian Statistical Institute, ACM Unit, Kolkata 700 108, India
Bhabani P. Sinha


About this paper

Cite this paper

Ju, J., Chaudhary, V. (1999). A Fission Technique Enabling Parallelization of Imperfectly Nested Loops. In: Banerjee, P., Prasanna, V.K., Sinha, B.P. (eds) High Performance Computing – HiPC’99. HiPC 1999. Lecture Notes in Computer Science, vol 1745. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-46642-0_13


DOI: https://doi.org/10.1007/978-3-540-46642-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66907-4
Online ISBN: 978-3-540-46642-0
eBook Packages: Springer Book Archive
