Abstract
The ability to express multiple levels of parallelism is one of the significant features of the OpenMP programming model. Pipeline parallelism, however, is not well supported in OpenMP. This paper proposes extensions to OpenMP directives aimed at expressing pipeline parallelism effectively. The extended directives fall into two groups: one defines precedence at the thread level, while the other defines precedence at the iteration level. With these directives, programmers can build pipeline models more easily and exploit more parallelism to improve performance. To support the directives, a set of runtime synchronization interfaces is implemented on the Cell heterogeneous multi-core architecture using its signal-block communication mechanism. Experimental results indicate that the pipeline scheme proposed in this paper achieves good performance compared to naively parallelized applications.
Cite this paper
Li, S., Yao, S., He, H., Sun, L., Chen, Y., Peng, Y. (2011). Extending Synchronization Constructs in OpenMP to Exploit Pipeline Parallelism on Heterogeneous Multi-core. In: Xiang, Y., Cuzzocrea, A., Hobbs, M., Zhou, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2011. Lecture Notes in Computer Science, vol 7017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24669-2_6