
Optimized Execution of Fortran 90 Array Language on Symmetric Shared-Memory Multiprocessors

  • Conference paper
  • First Online:
Languages and Compilers for Parallel Computing (LCPC 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1656)


Abstract

Past compilers have found it challenging to implement Fortran 90 array language on symmetric shared-memory multiprocessors (SMPs) so as to match, let alone beat, the performance of comparable Fortran 77 scalar loops. This is despite the fact that the semantics of array language is implicitly concurrent, while the semantics of scalar loops is implicitly sequential. A well-known obstacle to efficient execution of array language is the overhead of the array temporaries introduced to obey its fetch-before-store semantics. We observe that another major obstacle arises from the fact that most past compilers have attempted to compile and optimize each array statement in isolation.
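
To make the fetch-before-store issue concrete, the following is a minimal sketch (not taken from the paper; the array shape, names, and values are purely illustrative) of an array assignment whose right-hand side overlaps its left-hand side, together with the array temporary that a conservative scalarizing compiler would introduce:

    PROGRAM temp_demo
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 8
      REAL :: A(N), T(2:N-1)
      INTEGER :: I

      A = (/ (REAL(I), I = 1, N) /)

      ! Original source, a Fortran 90 array statement:
      !   A(2:N-1) = A(1:N-2) + A(3:N)
      ! Fetch-before-store semantics require that every element of the
      ! right-hand side be (conceptually) read before any element of the
      ! left-hand side is stored.

      ! Scalarizing the statement with a plain loop in either direction
      ! would read elements of A already overwritten by earlier iterations,
      ! so a conservative translation materializes an array temporary T:
      DO I = 2, N - 1
         T(I) = A(I-1) + A(I+1)
      END DO
      DO I = 2, N - 1
         A(I) = T(I)
      END DO

      PRINT *, A
    END PROGRAM temp_demo

The extra loads, stores, and storage for T are the overhead referred to above; eliminating such temporaries without violating the array semantics is the goal of array-temporary minimization.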

In this paper, we describe a solution for optimized compilation of Fortran 90 array language for execution on SMPs. Our solution optimizes scalarized loops and scalar loops in a common framework, and adapts past work on array-temporary minimization so as to avoid degrading parallelism and locality. This solution has been implemented in the IBM XL Fortran product compiler for SMPs. To the best of our knowledge, no other Fortran 90 compiler performs such combined optimization of scalarized loops and scalar loops. Our preliminary experimental results indicate that the performance of Fortran 90 array language can match, and even beat, the performance of comparable scalar loops. In addition to Fortran 90 array language, the approach outlined in this paper will be relevant to similar array-language extensions that might appear in Java and other programming languages in the future.
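
As a hedged illustration of what a common optimization framework can buy (the code, names, and transformation shown are illustrative sketches, not drawn from the paper), consider an array statement adjacent to an ordinary Fortran 77 loop:

    PROGRAM fusion_demo
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 8
      REAL :: A(N), B(N), C(N)
      INTEGER :: I

      A = 1.0

      ! Original source: an array statement followed by a scalar DO loop.
      !   B(1:N) = 2.0 * A(1:N)
      !   DO I = 1, N
      !      C(I) = B(I) + A(I)
      !   END DO

      ! When the scalarized form of the array statement is optimized in the
      ! same framework as the hand-written loop, the two loops can be fused;
      ! each B(I) is then reused while still in a register or cache, and the
      ! single fused loop remains a candidate for SMP parallelization:
      DO I = 1, N
         B(I) = 2.0 * A(I)
         C(I) = B(I) + A(I)
      END DO

      PRINT *, C
    END PROGRAM fusion_demo

Compiling the array statement in isolation would forgo this fusion, producing two separately parallelized loops with poorer locality.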





Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sarkar, V. (1999). Optimized Execution of Fortran 90 Array Language on Symmetric Shared-Memory Multiprocessors. In: Chatterjee, S., et al. Languages and Compilers for Parallel Computing. LCPC 1998. Lecture Notes in Computer Science, vol 1656. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48319-5_9


  • DOI: https://doi.org/10.1007/3-540-48319-5_9


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66426-0

  • Online ISBN: 978-3-540-48319-9

  • eBook Packages: Springer Book Archive
