
A Prototype Implementation of OpenMP Task Dependency Support

  • Conference paper
OpenMP in the Era of Low Power Devices and Accelerators (IWOMP 2013)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8122)


Abstract

OpenMP 3.0 introduced the concept of asynchronous tasks: independent units of work that may be dynamically created and scheduled. Task synchronization is accomplished by inserting taskwait and barrier constructs. However, for certain algorithms on large platforms, these constructs force global synchronization and may incur significant overhead when used inappropriately. The performance of such algorithms may benefit substantially if a mechanism for specifying finer-grained, point-to-point synchronization between tasks is available. In this paper we present extensions to the current OpenMP task directive that enable the specification of dependencies among tasks. A task waits only until the explicit dependencies specified by the programmer are satisfied, thereby enabling support for a dataflow model within OpenMP. We evaluate the extensions, implemented in the OpenUH OpenMP compiler, using LU decomposition and Smith-Waterman algorithms. By applying the extensions to the two algorithms, we demonstrate significant performance improvement over the standard tasking versions. When comparing our results with those obtained using the related dataflow models OmpSs and QUARK, we observed that the versions using our task extensions delivered an average speedup of 2-6x.
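
The point-to-point synchronization described in the abstract can be illustrated with a minimal sketch. It assumes the `depend` clause as standardized in OpenMP 4.0, which is not necessarily the syntax of the OpenUH prototype evaluated in the paper. The function name `wavefront_fill`, the grid size `N`, and the recurrence are hypothetical stand-ins for a Smith-Waterman-style wavefront: each cell waits only for its upper and left neighbours, where a taskwait-based version would synchronize all tasks after every anti-diagonal.

```c
#include <stddef.h>

#define N 4 /* hypothetical grid size, chosen only for illustration */

/*
 * Fills a padded (N+1) x (N+1) grid with the recurrence
 *   grid[i][j] = 1 + max(grid[i-1][j], grid[i][j-1]),
 * using row 0 and column 0 as zero-valued boundaries.
 * One task is created per interior cell; the depend clauses express
 * that a cell may start as soon as its upper and left neighbours finish.
 */
void wavefront_fill(int grid[N + 1][N + 1])
{
    /* Boundary cells are written before any task is created. */
    for (int k = 0; k <= N; k++) {
        grid[0][k] = 0;
        grid[k][0] = 0;
    }

    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 1; i <= N; i++) {
            for (int j = 1; j <= N; j++) {
                /* in: the two cells read; out: the cell written. */
                #pragma omp task firstprivate(i, j) shared(grid) \
                        depend(in: grid[i - 1][j], grid[i][j - 1]) \
                        depend(out: grid[i][j])
                {
                    int up = grid[i - 1][j];
                    int left = grid[i][j - 1];
                    grid[i][j] = 1 + (up > left ? up : left);
                }
            }
        }
    } /* the barrier ending the region waits for all outstanding tasks */
}
```

Compiled without OpenMP support, the pragmas are ignored and the loop nest runs serially with the same result, which makes the sketch easy to check: for this recurrence every interior cell ends up holding `i + j - 1`.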




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ghosh, P., Yan, Y., Eachempati, D., Chapman, B. (2013). A Prototype Implementation of OpenMP Task Dependency Support. In: Rendell, A.P., Chapman, B.M., Müller, M.S. (eds) OpenMP in the Era of Low Power Devices and Accelerators. IWOMP 2013. Lecture Notes in Computer Science, vol 8122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40698-0_10


  • DOI: https://doi.org/10.1007/978-3-642-40698-0_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40697-3

  • Online ISBN: 978-3-642-40698-0

  • eBook Packages: Computer Science, Computer Science (R0)
