Abstract
This paper describes MARS, an automatic parallelising compiler targeting shared memory machines. It uses a data partitioning approach, traditionally used for distributed memory machines, to globally reduce overheads such as communication and synchronisation. Its high-level linear algebraic representation allows direct application of, for instance, unimodular transformations and global application of data transformations. Although a data-based approach allows global analysis and in many instances outperforms local, loop-orientated parallelisation approaches, we have identified two particular problems when applying data parallelism to sequential Fortran 77, as opposed to data parallel dialects tailored to distributed memory targets. This paper describes two techniques to overcome these problems and evaluates their applicability. Preliminary results on two SPECfp92 benchmarks show that, with these optimisations, MARS outperforms existing state-of-the-art loop-based auto-parallelisation approaches.
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
O’Boyle, M.F.P. (1998). MARS: A Distributed Memory Approach to Shared Memory Compilation. In: O’Hallaron, D.R. (eds) Languages, Compilers, and Run-Time Systems for Scalable Computers. LCR 1998. Lecture Notes in Computer Science, vol 1511. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49530-4_19
DOI: https://doi.org/10.1007/3-540-49530-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65172-7
Online ISBN: 978-3-540-49530-7
eBook Packages: Springer Book Archive