
Experiences Developing the OpenUH Compiler and Runtime Infrastructure

International Journal of Parallel Programming

Abstract

The OpenUH compiler is a branch of the open source Open64 compiler suite for C, C++, and Fortran 95/2003, with support for a variety of targets including x86_64, IA-64, and IA-32. For the past several years, we have used OpenUH to conduct research in parallel programming models and their implementation, static and dynamic analysis of parallel applications, and compiler integration with external tools. In this paper, we describe the evolution of the OpenUH infrastructure and how we have used it to carry out our research and teaching efforts.
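
The sketch below is a minimal, generic illustration of the kind of directive-based parallel code that an OpenMP compiler such as OpenUH accepts; it is not an example from the paper, and it assumes nothing beyond a standard OpenMP-capable C compiler (no OpenUH-specific runtime entry points or driver flags).

/* Illustrative sketch only (not taken from the paper): a standard OpenMP
   work-sharing loop of the kind a directive-based compiler translates into
   an outlined function plus calls into its threaded runtime library. */
#include <omp.h>
#include <stdio.h>

int main(void) {
    const int n = 1000000;
    double sum = 0.0;

    /* The compiler outlines the region below, forks a team of threads,
       divides the iteration space among them, and combines the per-thread
       partial sums requested by the reduction clause. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += 1.0 / (i + 1);
    }

    printf("partial harmonic sum over %d terms: %f (max threads: %d)\n",
           n, sum, omp_get_max_threads());
    return 0;
}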




Acknowledgments

We would like to thank our funding agencies for their support. The work described in this paper was funded by the following grants: National Science Foundation under contracts CCF-0444468, CCF-0702775, and CCF-0833201; Department of Energy under contracts DE-FC03-01ER25502 and DE-FC02-06ER25759. Support for our CAF implementation was partially sponsored by Total.

Author information

Corresponding author

Correspondence to Deepak Eachempati.


About this article

Cite this article

Chapman, B., Eachempati, D. & Hernandez, O. Experiences Developing the OpenUH Compiler and Runtime Infrastructure. Int J Parallel Prog 41, 825–854 (2013). https://doi.org/10.1007/s10766-012-0230-9
