Abstract
With the emergence of many-core processor architectures onto the HPC scene, concerns arise regarding the performance and productivity of numerous existing parallel-programming tools, models, and languages. As these devices begin augmenting conventional distributed cluster systems in an evolving age of heterogeneous supercomputing, many-core processors must be properly evaluated and profiled in order to understand their performance and architectural strengths with existing parallel-programming environments and HPC applications. This paper compares the performance of two many-core processors, the Tilera TILE-Gx8036 and the Intel Xeon Phi 5110P, in terms of application performance with the SHMEM and OpenMP parallel-programming environments. Several applications written or provided in SHMEM and OpenMP are evaluated in order to analyze the scalability of existing tools and libraries on these many-core platforms. Our results show that SHMEM and OpenMP parallel applications scale well on the TILE-Gx and Xeon Phi, but that their performance depends heavily on optimized libraries and instrumentation.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Lam, B.C., Barboza, A., Agrawal, R., George, A.D., Lam, H. (2014). Benchmarking Parallel Performance on Many-Core Processors. In: Poole, S., Hernandez, O., Shamis, P. (eds) OpenSHMEM and Related Technologies. Experiences, Implementations, and Tools. OpenSHMEM 2014. Lecture Notes in Computer Science, vol 8356. Springer, Cham. https://doi.org/10.1007/978-3-319-05215-1_3
DOI: https://doi.org/10.1007/978-3-319-05215-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05214-4
Online ISBN: 978-3-319-05215-1
eBook Packages: Computer Science (R0)