Abstract
Much prior work has developed techniques for the case where data arrays are indexed through indirection arrays. However, these techniques may be ineffective for nonlinear indexing. In this paper, we propose extensions to OpenMP directives that allow irregular OpenMP codes, including those with nonlinear indexing, to be executed efficiently in parallel. We also present several optimization techniques for irregular computing: generation of communication sets and SPMD code, a communication scheduling strategy, and a low-overhead locality transformation scheme. Finally, we present experimental results that validate our extensions and optimization techniques.
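To make the two access patterns named in the abstract concrete, the following is a minimal sketch in C using only the standard `#pragma omp parallel for` directive. It does not use or reproduce the extensions proposed in the paper. The function names `gather_indirect` and `gather_nonlinear` and the subscript `i*(i+1)/2` are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/* Irregular access through an indirection array: the element read in
 * iteration i is data[idx[i]], so the access pattern is known only at
 * run time, once idx has been filled in. (Illustrative name.) */
void gather_indirect(const double *data, const int *idx, double *out, int n)
{
    assert(n >= 0);
    /* Each iteration writes a distinct out[i], so the loop is race-free. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        out[i] = data[idx[i]];
    }
}

/* Irregular access through a nonlinear subscript: the index i*(i+1)/2
 * is a closed-form but non-affine function of the loop variable, so
 * there is no indirection array to inspect. (Illustrative subscript;
 * data must hold at least (n-1)*n/2 + 1 elements.) */
void gather_nonlinear(const double *data, double *out, int n)
{
    assert(n >= 0);
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        out[i] = data[i * (i + 1) / 2];
    }
}
```

Both loops only read the irregularly indexed array, which keeps the sketch free of write conflicts. On a cluster, the elements of `data` touched by each process would additionally have to be identified and communicated, which is the problem the communication-set generation mentioned in the abstract addresses.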
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, J., Hu, C., Zhang, J., Li, J. (2008). OpenMP Extensions for Irregular Parallel Applications on Clusters. In: Chapman, B., Zheng, W., Gao, G.R., Sato, M., Ayguadé, E., Wang, D. (eds) A Practical Programming Model for the Multi-Core Era. IWOMP 2007. Lecture Notes in Computer Science, vol 4935. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69303-1_9
DOI: https://doi.org/10.1007/978-3-540-69303-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69302-4
Online ISBN: 978-3-540-69303-1
eBook Packages: Computer Science, Computer Science (R0)