
Extending OpenMP for Machine Learning-Driven Adaptation

  • Conference paper
Accelerator Programming Using Directives (WACCPD 2021)

Abstract

OpenMP 5.0 introduced the metadirective directive to support compile-time selection from a set of directive variants based on the OpenMP context. OpenMP 5.1 extended the context information to include user-defined conditions, enabling user-guided runtime adaptation. However, defining conditions that capture the complex interactions between applications and hardware platforms well enough to select an optimized variant is challenging for programmers. This paper explores a novel approach that automates runtime adaptation through machine learning. We design a new directive that describes the semantics of model-driven adaptation and develop a prototype implementation. Using the Smith-Waterman algorithm as a use case, our experiments demonstrate that the proposed adaptive OpenMP extension automatically chooses the code variant that delivers the best performance on heterogeneous platforms combining CPU and GPU processing capabilities. Decision tree models used for tuning select the optimal variant with up to 93.1% accuracy and negligible runtime overhead.



Acknowledgment

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-CONF-826432). The OpenMP language extension work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research. The compiler and runtime work was supported by LLNL-LDRD 21-ERD-018.

Author information

Correspondence to Chunhua Liao.

Artifact Availability Statement

Summary of the Experiments Reported: We ran the Smith-Waterman algorithm on LLNL's Corona and Pascal supercomputers. The detailed software configurations are given in the experiment section of the paper.

Software Artifact Availability: All author-created software artifacts are maintained in a public repository under an OSI-approved license.

Hardware Artifact Availability: There are no author-created hardware artifacts.

Data Artifact Availability: All author-created data artifacts are maintained in a public repository under an OSI-approved license.

Proprietary Artifacts: None of the associated artifacts, author-created or otherwise, are proprietary.

List of URLs and/or DOIs where artifacts are available:

https://doi.org/10.5281/zenodo.5706501


Copyright information

© 2022 Anjia Wang and Yonghong Yan, and Lawrence Livermore National Security, LLC, under exclusive license to Springer Nature Switzerland AG, part of Springer Nature

About this paper


Cite this paper

Liao, C. et al. (2022). Extending OpenMP for Machine Learning-Driven Adaptation. In: Bhalachandra, S., Daley, C., Melesse Vergara, V. (eds) Accelerator Programming Using Directives. WACCPD 2021. Lecture Notes in Computer Science(), vol 13194. Springer, Cham. https://doi.org/10.1007/978-3-030-97759-7_3


  • DOI: https://doi.org/10.1007/978-3-030-97759-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-97758-0

  • Online ISBN: 978-3-030-97759-7

  • eBook Packages: Computer Science (R0)
