A Proposal for Supporting Speculation in the OpenMP taskloop Construct

  • Conference paper
OpenMP: Conquering the Full Hardware Spectrum (IWOMP 2019)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11718)

Abstract

Parallelization constructs in OpenMP, such as parallel for or taskloop, are typically restricted to loops that have no loop-carried dependencies (DOALL) or that contain well-known structured dependence patterns (e.g., reduction). These restrictions prevent the parallelization of many computationally intensive may DOACROSS loops. In such loops, the compiler cannot prove that the loop is free of loop-carried dependencies, even though such dependencies may not occur at runtime. This paper proposes a new clause for taskloop that enables speculative parallelization of may DOACROSS loops: the tls clause. We also present an initial evaluation that reveals that: (a) for certain loops, slowdowns using DOACROSS techniques can be transformed into speed-ups of up to \(2.14\times\) by applying speculative parallelization of tasks; and (b) the scheduling of tasks implemented in the Intel OpenMP runtime exacerbates the ratio of order inversion aborts after applying the taskloop-tls parallelization to a loop.
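To make the notion of a may DOACROSS loop concrete, the following is a minimal sketch in C and is not taken from the paper. The loop body, the index arrays src and dst, and all function names are hypothetical. Whether iterations depend on each other is decided only by the runtime contents of src and dst, so a compiler cannot prove the loop DOALL. The sketch uses only the standard taskloop construct; the syntax of the proposed tls clause is not shown here because this excerpt does not give it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical may DOACROSS loop: iteration i reads a[src[i]] and writes
 * a[dst[i]]. A loop-carried dependence exists only if the index arrays
 * happen to overlap across iterations, which is unknown at compile time. */
void may_doacross_serial(int *a, const size_t *src, const size_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[dst[i]] = a[src[i]] + 1;
}

/* Counts the iterations that read a location written by an earlier
 * iteration (flow dependences only). A complete independence check would
 * also have to cover anti and output dependences. len is the length of a. */
size_t count_flow_dependent_iterations(const size_t *src, const size_t *dst,
                                       size_t n, size_t len)
{
    assert(len > 0);
    unsigned char *written = calloc(len, 1);
    assert(written != NULL);

    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        assert(src[i] < len && dst[i] < len);
        if (written[src[i]])
            hits++;                 /* value was produced by an earlier iteration */
        written[dst[i]] = 1;
    }
    free(written);
    return hits;
}

/* The same loop under a plain (non-speculative) taskloop. This is correct
 * only when the caller already knows the iterations are independent.
 * Without OpenMP enabled, the pragmas are ignored and the loop runs serially. */
void independent_taskloop(int *a, const size_t *src, const size_t *dst, size_t n)
{
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop
    for (size_t i = 0; i < n; i++)
        a[dst[i]] = a[src[i]] + 1;
}
```

With src = {0, 1, 2, 3} and dst = {4, 5, 6, 7} no iteration reads what another wrote, so the taskloop version is safe. With src = {0, 4, 2, 6} two iterations consume values produced by earlier ones, and a non-speculative taskloop could yield a result different from the serial one. Speculative execution targets loops where such cases are possible but infrequent.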

Notes

  1. Small \(\%lc\). ordered results in performance degradation with respect to serial execution for these loops [10].

  2. Clang 4.0 was adapted to generate the AST that supports the new clause, as explained in Sect. 3.

  3. Speculative privatizations described in [19] were implemented manually.

References

  1. Aldea, S., Estebanez, A., Llanos, D.R., Gonzalez-Escribano, A.: An OpenMP extension that supports thread-level speculation. IEEE Trans. Parallel Distrib. Syst. 27(1), 78–91 (2016)

  2. Ayguade, E., et al.: The design of OpenMP tasks. IEEE Trans. Parallel Distrib. Syst. (TPDS) 20(3), 404–418 (2009)

  3. Cytron, R.: Doacross: beyond vectorization for multiprocessors. In: International Conference on Parallel Processing (ICPP), pp. 836–844 (1986)

  4. Etsion, Y., et al.: Task superscalar: an out-of-order task pipeline. In: International Symposium on Microarchitecture, Washington, DC, USA, pp. 89–100 (2010)

  5. cTuning Foundation: Cbench: collective benchmarks (2016). http://ctuning.org/cbench

  6. Herlihy, M., Moss, J.E.: Transactional memory: architectural support for lock-free data structures. In: International Symposium on Computer Architecture (ISCA), San Diego, CA, USA, pp. 289–300, May 1993

  7. IBM: IBM XL C/C++ for Blue Gene/Q, V12.1 Compiler Reference (2012). http://www-01.ibm.com/support/docview.wss?uid=swg27027065&aid=1

  8. Intel Corporation: Intel architecture instruction set extensions programming reference. Chapter 8: Intel transactional synchronization extensions (2012)

  9. Lamport, L.: The parallel execution of do loops. Commun. ACM 17(2), 83–93 (1974)

  10. Mattos, L., Cesar, D., Salamanca, J., de Carvalho, J.P.L., Pereira, M., Araujo, G.: Doacross parallelization based on component annotation and loop-carried probability. In: International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Lyon, France, pp. 29–32 (2018)

  11. Moore, K.E., Bobba, J., Moravan, M.J., Hill, M.D., Wood, D.A.: LogTM: log-based transactional memory. In: High-Performance Computer Architecture (HPCA), pp. 254–265 (2006)

  12. Murphy, N., Jones, T., Mullins, R., Campanoni, S.: Performance implications of transient loop-carried data dependences in automatically parallelized loops. In: International Conference on Compiler Construction (CC), Barcelona, Spain, pp. 23–33 (2016)

  13. OpenMP-ARB: OpenMP application program interface version 4.5 (2015)

  14. OpenMP-ARB: OpenMP application program interface version 5.0 (2018)

  15. Ottoni, G., Rangan, R., Stoler, A., August, D.I.: Automatic thread extraction with decoupled software pipelining. In: International Symposium on Microarchitecture (MICRO), p. 12, November 2005

  16. Perez, J.M., Badia, R.M., Labarta, J.: A dependency-aware task-based programming environment for multi-core architectures. In: 2008 IEEE International Conference on Cluster Computing, Tsukuba, Japan, pp. 142–151 (2008)

  17. Podobas, A., Karlsson, S.: Towards unifying OpenMP under the task-parallel paradigm. In: International Workshop on OpenMP (IWOMP), Nara, Japan, pp. 116–129 (2016)

  18. Salamanca, J., Amaral, J.N., Araujo, G.: Evaluating and improving thread-level speculation in hardware transactional memories. In: IEEE International Parallel and Distributed Processing Symposium (IPDPS), Chicago, USA, pp. 586–595 (2016)

  19. Salamanca, J., Amaral, J.N., Araujo, G.: Using hardware-transactional-memory support to implement thread-level speculation. IEEE Trans. Parallel Distrib. Syst. 29(2), 466–480 (2018)

  20. Salamanca, J., Amaral, J.N., Araujo, G.: Performance evaluation of thread-level speculation in off-the-shelf hardware transactional memories. In: Rivera, F.F., Pena, T.F., Cabaleiro, J.C. (eds.) Euro-Par 2017. LNCS, vol. 10417, pp. 607–621. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-64203-1_44

  21. Sohi, G.S., Breach, S.E., Vijaykumar, T.N.: Multiscalar processors. In: International Symposium on Computer Architecture (ISCA), Santa Margherita Ligure, Italy, pp. 414–425 (1995)

  22. Steffan, J., Mowry, T.: The potential for using thread-level data speculation to facilitate automatic parallelization. In: High-Performance Computer Architecture (HPCA), Washington, USA, pp. 2–13 (1998)

  23. Steffan, J.G., Colohan, C.B., Zhai, A., Mowry, T.C.: A scalable approach to thread-level speculation. In: International Symposium on Computer Architecture (ISCA), Vancouver, British Columbia, Canada, pp. 1–12 (2000)

  24. Teruel, X., Klemm, M., Li, K., Martorell, X., Olivier, S.L., Terboven, C.: A proposal for task-generating loops in OpenMP*. In: International Workshop on OpenMP (IWOMP), Canberra, Australia (2013)

  25. Torrellas, J.: Speculation, thread-level. In: Padua, D. (ed.) Encyclopedia of Parallel Computing, pp. 1894–1900. Springer, Boston (2011). https://doi.org/10.1007/978-0-387-09766-4_170


Acknowledgments

The authors would like to thank the anonymous reviewers for the insightful comments. This work is supported by FAPESP (grants 18/07446-8 and 18/15519-5).

Author information

Corresponding author

Correspondence to Juan Salamanca.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Salamanca, J., Baldassin, A. (2019). A Proposal for Supporting Speculation in the OpenMP taskloop Construct. In: Fan, X., de Supinski, B., Sinnen, O., Giacaman, N. (eds) OpenMP: Conquering the Full Hardware Spectrum. IWOMP 2019. Lecture Notes in Computer Science, vol 11718. Springer, Cham. https://doi.org/10.1007/978-3-030-28596-8_17

  • DOI: https://doi.org/10.1007/978-3-030-28596-8_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28595-1

  • Online ISBN: 978-3-030-28596-8

  • eBook Packages: Computer Science (R0)
