
Negative Perceptions About the Applicability of Source-to-Source Compilers in HPC: A Literature Review

Conference paper in High Performance Computing (ISC High Performance 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12761)


Abstract

A source-to-source (S2S) compiler is a type of translator that accepts the source code of a program written in one programming language as its input and produces equivalent source code in the same or a different programming language. S2S techniques are commonly used to enable translation between high-level programming languages, to perform large-scale refactoring operations, and to facilitate instrumentation for dynamic analysis. Negative perceptions about the applicability of S2S in High Performance Computing (HPC) are studied and evaluated here. This is a first study to bring to light the reasons why scientists do not use source-to-source techniques for HPC. The primary audience for this paper is those considering S2S technology in their HPC application work.
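As a concrete illustration of the definition above, the sketch below (ours, not from the paper) implements a tiny source-to-source pass in Python using the standard `ast` module: it parses source text, rewrites the pattern `x ** 2` into `x * x`, and unparses the result back into equivalent source. The `SquareToMultiply` class and the sample `norm2` function are hypothetical names chosen for illustration, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast

class SquareToMultiply(ast.NodeTransformer):
    """Toy S2S pass: rewrite `x ** 2` into `x * x`, leaving everything else intact."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if (isinstance(node.op, ast.Pow)
                and isinstance(node.right, ast.Constant)
                and node.right.value == 2):
            # Emit an equivalent multiplication; the operand is a simple name
            # here, so duplicating it is safe in this toy example.
            return ast.BinOp(left=node.left, op=ast.Mult(), right=node.left)
        return node

original = "def norm2(x, y):\n    return x ** 2 + y ** 2\n"
tree = ast.parse(original)             # source -> AST
tree = SquareToMultiply().visit(tree)  # AST -> transformed AST
ast.fix_missing_locations(tree)
print(ast.unparse(tree))               # transformed AST -> equivalent source
```

Production S2S infrastructures such as ROSE or Cetus follow the same parse-transform-unparse structure, but operate on full C/C++/Fortran front ends and preserve comments and formatting.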




Acknowledgements

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA-0003525. Images used by permission. SAND2021-9377 C.

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-CONF-821299.

Author information

Corresponding author: Reed Milewicz.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Milewicz, R., Pirkelbauer, P., Soundararajan, P., Ahmed, H., Skjellum, T. (2021). Negative Perceptions About the Applicability of Source-to-Source Compilers in HPC: A Literature Review. In: Jagode, H., Anzt, H., Ltaief, H., Luszczek, P. (eds) High Performance Computing. ISC High Performance 2021. Lecture Notes in Computer Science, vol 12761. Springer, Cham. https://doi.org/10.1007/978-3-030-90539-2_16

  • DOI: https://doi.org/10.1007/978-3-030-90539-2_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-90538-5

  • Online ISBN: 978-3-030-90539-2

