Abstract
This paper presents an approach to implementing OpenMP on clusters by translating OpenMP programs to Global Arrays (GA). We describe the basic translation strategy from OpenMP to GA. GA requires a data distribution; rather than expecting the user to supply one, we show how data and work are distributed according to OpenMP static loop scheduling. For irregular applications, an inspector-executor strategy is employed to gather information on accesses to potentially non-local data, group non-local data transfers, and overlap communication with local computation. Furthermore, we propose a new directive, INVARIANT, that provides information about the dynamic scope of data access patterns. This directive helps generate efficient code for irregular applications that use the inspector-executor approach. Our experiments show promising results for the corresponding regular and irregular GA codes.
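The inspector-executor idea summarized in the abstract can be illustrated with a small, single-process sketch. It is not the paper's implementation and does not call the GA library: the global array is simulated with a Python list, and the names `static_block_bounds`, `owner_of`, `inspector`, and `executor` are hypothetical. The sketch assumes that a static schedule without a chunk size is realized as contiguous blocks of ceil(n / nprocs) iterations, which is one common choice, and that the data array is distributed with the same block rule. Overlapping communication with computation is not modeled.

```python
from collections import defaultdict


def static_block_bounds(n, nprocs, rank):
    """Half-open range [lo, hi) owned by `rank` under a block split.

    Assumption: blocks of ceil(n / nprocs) elements, mimicking a static
    schedule with no chunk size.
    """
    chunk = max(1, -(-n // nprocs))  # ceiling division, at least 1
    lo = min(rank * chunk, n)
    hi = min(lo + chunk, n)
    return lo, hi


def owner_of(index, n, nprocs):
    """Rank that owns element `index` under the same block split."""
    chunk = max(1, -(-n // nprocs))
    return index // chunk


def inspector(idx, n, nprocs, rank):
    """Scan this rank's iterations of `y[i] = x[idx[i]]` and record which
    elements of x are non-local, grouped by owning rank and deduplicated."""
    lo, hi = static_block_bounds(len(idx), nprocs, rank)
    x_lo, x_hi = static_block_bounds(n, nprocs, rank)
    groups = defaultdict(set)
    for i in range(lo, hi):
        j = idx[i]
        if not (x_lo <= j < x_hi):
            groups[owner_of(j, n, nprocs)].add(j)
    return {owner: sorted(js) for owner, js in groups.items()}


def executor(global_x, idx, schedule, nprocs, rank):
    """Fetch each group of non-local elements in one simulated transfer,
    then run the loop body using local and fetched values only."""
    n = len(global_x)
    lo, hi = static_block_bounds(len(idx), nprocs, rank)
    x_lo, x_hi = static_block_bounds(n, nprocs, rank)
    local_x = {j: global_x[j] for j in range(x_lo, x_hi)}
    ghost = {}
    transfers = 0
    for owner, js in schedule.items():
        # Stand-in for a single grouped one-sided get from `owner`.
        ghost.update((j, global_x[j]) for j in js)
        transfers += 1
    result = []
    for i in range(lo, hi):
        j = idx[i]
        result.append(local_x[j] if j in local_x else ghost[j])
    return result, transfers


if __name__ == "__main__":
    x = [10 * k for k in range(8)]
    idx = [7, 0, 1, 6, 2, 7, 3, 0]
    for rank in range(2):
        schedule = inspector(idx, len(x), 2, rank)
        print(rank, schedule, executor(x, idx, schedule, 2, rank))
```

In this sketch the inspector's output depends only on `idx`, so it could be computed once and reused for as long as `idx` does not change. That is the kind of information a directive such as INVARIANT is described as providing; how the paper's compiler uses it is not shown here.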
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, Z., Huang, L., Chapman, B., Weng, TH. (2005). Efficient Implementation of OpenMP for Clusters with Implicit Data Distribution. In: Chapman, B.M. (eds) Shared Memory Parallel Programming with Open MP. WOMPAT 2004. Lecture Notes in Computer Science, vol 3349. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31832-3_11
DOI: https://doi.org/10.1007/978-3-540-31832-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24560-5
Online ISBN: 978-3-540-31832-3
eBook Packages: Computer Science (R0)