Abstract
As a lightweight, library-based Partitioned Global Address Space (PGAS) programming model, OpenSHMEM provides efficient one-sided and collective communication and has received growing attention in recent years. At the same time, task-based programming models are gaining traction in scientific computing communities. Application developers are attracted by their ability to achieve better load balance in the face of ever-growing application complexity and the increasing on-node parallelism of modern high-performance computing machines. Although communication contexts give threads first-class access to the network in the OpenSHMEM+X model, OpenSHMEM still has very limited ability to perform the advanced operations found in task-based models. For example, achieving remote task launch with the signal/wait routines requires more work than using the remote procedure call (RPC) mechanism of the UPC++ programming model. In this paper, we introduce a lightweight active message (AM) extension to OpenSHMEM that is designed to perform short, non-blocking remote function invocations. This extension aims to bring some of the benefits of task-based programming to OpenSHMEM without turning it into a full-blown, heavyweight tasking system with a sophisticated scheduler. We study the performance of the extension with micro-benchmarks and evaluate its computational efficiency at different task granularities using the TaskBench framework.
Acknowledgement
This research was funded in part by the United States Department of Defense, and was supported by resources at Los Alamos National Laboratory, operated by Triad National Security, LLC under Contract No. 89233218CNA000001.
The authors would also like to thank Stony Brook Research Computing and Cyberinfrastructure, and the Institute for Advanced Computational Science at Stony Brook University for access to the innovative high-performance Ookami computing system, which was made possible by a $5M National Science Foundation grant (#1927880).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Lu, W., Curtis, T., Chapman, B. (2022). OpenSHMEM Active Message Extension for Task-Based Programming. In: Poole, S., Hernandez, O., Baker, M., Curtis, T. (eds) OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Exascale and Smart Networks. OpenSHMEM 2021. Lecture Notes in Computer Science, vol 13159. Springer, Cham. https://doi.org/10.1007/978-3-031-04888-3_8
DOI: https://doi.org/10.1007/978-3-031-04888-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04887-6
Online ISBN: 978-3-031-04888-3
eBook Packages: Computer Science, Computer Science (R0)