Abstract
The main attraction of Partitioned Global Address Space (PGAS) languages for programmers is the ability to distribute data in a way that exploits the affinity of threads to data within shared-memory domains. Thus, PGAS languages, such as Unified Parallel C (UPC), are a promising programming paradigm for emerging parallel machines that employ hierarchical data- and task-parallelism. For example, large systems are built as distributed shared-memory architectures, in which multi-core nodes access a local, coherent address space and many such nodes are interconnected through a non-coherent address space to form a high-performance system.
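To make the distribution mechanism concrete, the following minimal UPC sketch (illustrative only; the array name a and block size B are assumptions, not taken from the paper) declares a shared array whose elements are blocked across threads, and uses an affinity expression so that each thread touches only its own partition:

    #include <upc.h>

    #define B 256  /* illustrative block size per thread */

    /* The layout qualifier [B] distributes the array in blocks of B
       elements, one block per thread in round-robin order, so each
       thread owns one contiguous block in its shared-memory domain. */
    shared [B] double a[B * THREADS];

    int main(void) {
        /* The affinity expression &a[i] makes each thread execute only
           the iterations whose element resides in its own partition. */
        upc_forall (int i = 0; i < B * THREADS; i++; &a[i])
            a[i] = (double)i;
        return 0;
    }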
This paper studies the access patterns of shared data in UPC programs. By analyzing these patterns we make three major observations about the characteristics of programs written in a PGAS programming model: (i) there is strong evidence to support the automatic identification and automatic privatization of local shared data accesses; (ii) the programmer's ability to specify how shared data is distributed among the executing threads can result in significant performance improvements; (iii) running UPC programs on a hybrid architecture significantly increases the opportunities for automatic privatization of local shared data accesses.
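As a hedged sketch of the privatization referred to in observations (i) and (iii) (the function and variable names below are hypothetical, not the paper's implementation), a compiler could replace accesses to shared elements that have affinity to the executing thread with accesses through an ordinary private pointer, eliminating the per-access overhead of shared-address translation:

    #include <upc.h>

    #define B 256
    shared [B] double a[B * THREADS];

    void scale_local_block(double s) {
        /* Elements a[MYTHREAD*B .. MYTHREAD*B + B - 1] have affinity to
           the executing thread, so UPC permits casting the pointer-to-shared
           to a private pointer; each subsequent access then compiles to a
           plain load/store instead of a shared-address translation. */
        double *p = (double *)&a[MYTHREAD * B];
        for (int i = 0; i < B; i++)
            p[i] *= s;
    }

Performing this cast automatically, rather than leaving it to the programmer, is precisely the opportunity that the observed access patterns suggest.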
© 2007 Springer Berlin Heidelberg
Cite this paper
Barton, C., Caşcaval, C., Amaral, J.N. (2007). A Characterization of Shared Data Access Patterns in UPC Programs. In: Almási, G., Caşcaval, C., Wu, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2006. Lecture Notes in Computer Science, vol 4382. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72521-3_9
DOI: https://doi.org/10.1007/978-3-540-72521-3_9
Publisher: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72520-6
Online ISBN: 978-3-540-72521-3