Extending Synchronization Constructs in OpenMP to Exploit Pipeline Parallelism on Heterogeneous Multi-core

Li, Shigang; Yao, Shucai; He, Haohu; Sun, Lili; Chen, Yi; Peng, Yunfeng

doi:10.1007/978-3-642-24669-2_6

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7017)

Abstract

The ability to express multiple levels of parallelism is one of the significant features of the OpenMP parallel programming model. However, pipeline parallelism is not well supported in OpenMP. This paper proposes extensions to the OpenMP directives that aim to express pipeline parallelism effectively. The extended directives are divided into two groups: one defines precedence at the thread level, while the other defines precedence at the iteration level. With these directives, programmers can establish a pipeline model more easily and exploit more parallelism to improve performance. To support the directives, a set of runtime interfaces for synchronization is implemented on the Cell heterogeneous multi-core architecture using a signal block communication mechanism. Experimental results indicate that the proposed pipeline scheme achieves good performance compared with naive parallel applications.
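To make the idea of iteration-level precedence concrete, the following is a minimal sketch in C. It does not use the directives proposed in the paper, whose syntax is not given in this abstract. Instead it assumes the doacross form of the `ordered` construct that was later standardized in OpenMP 4.5 (`depend(sink: ...)` and `depend(source)`). The array sizes and the function names `init_border` and `wavefront` are hypothetical and chosen only for illustration.

```c
#define N 8  /* rows (hypothetical size) */
#define M 8  /* columns (hypothetical size) */

/* Set the first row and first column to 1 and clear the interior. */
static void init_border(int a[N][M])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            a[i][j] = (i == 0 || j == 0) ? 1 : 0;
}

/*
 * Wavefront update: a[i][j] needs a[i-1][j] and a[i][j-1].
 * Assumption: OpenMP 4.5 doacross clauses stand in for the paper's
 * iteration-level precedence directives. Rows are dealt out to threads
 * round-robin, so each thread trails the one working on the row above,
 * which gives a pipelined execution.
 */
static void wavefront(int a[N][M])
{
    #pragma omp parallel for ordered(2) schedule(static, 1)
    for (int i = 1; i < N; i++) {
        for (int j = 1; j < M; j++) {
            /* Wait until iterations (i-1, j) and (i, j-1) have signalled. */
            #pragma omp ordered depend(sink: i - 1, j) depend(sink: i, j - 1)
            a[i][j] = a[i - 1][j] + a[i][j - 1];
            /* Signal that iteration (i, j) is complete. */
            #pragma omp ordered depend(source)
        }
    }
}
```

Calling `init_border(a)` followed by `wavefront(a)` on an 8 x 8 array fills it with lattice-path counts, so `a[7][7]` ends up as 3432. If the compiler is run without OpenMP support, the pragmas are ignored and the loops execute sequentially with the same result, which makes the sketch easy to check.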

Author information

Authors and Affiliations

University of Science and Technology Beijing, 100083, Beijing, China
Shigang Li, Shucai Yao, Haohu He, Lili Sun, Yi Chen & Yunfeng Peng

Editor information

Editors and Affiliations

School of Information Technology, Deakin University, Melbourne Burwood Campus, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Yang Xiang & Wanlei Zhou
ICAR-CNR and University of Calabria, Via P. Bucci 41 C, 87036, Rende, CS, Italy
Alfredo Cuzzocrea
School of Information Technology, Deakin University, Geelong Waurn Ponds Campus, Pigdons Road, 3217, Geelong, VIC, Australia
Michael Hobbs

About this paper

Cite this paper

Li, S., Yao, S., He, H., Sun, L., Chen, Y., Peng, Y. (2011). Extending Synchronization Constructs in OpenMP to Exploit Pipeline Parallelism on Heterogeneous Multi-core. In: Xiang, Y., Cuzzocrea, A., Hobbs, M., Zhou, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2011. Lecture Notes in Computer Science, vol 7017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24669-2_6

DOI: https://doi.org/10.1007/978-3-642-24669-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24668-5
Online ISBN: 978-3-642-24669-2
eBook Packages: Computer Science, Computer Science (R0)
