SHCOLL - A Standalone Implementation of OpenSHMEM-Style Collectives API

  • Conference paper

OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity (OpenSHMEM 2018)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11283)

Abstract

The performance of collective operations has a large impact on the overall performance of many HPC applications. Implementing multiple algorithms and selecting the optimal one based on the message size and the number of processes involved in the operation is essential to achieving good performance. In this paper, we present SHCOLL, a collective routines library developed on top of OpenSHMEM point-to-point operations: puts, gets, atomic memory updates, and memory synchronization routines. The library is designed to serve as a plug-in to OpenSHMEM implementations and will be used by the OSSS OpenSHMEM reference implementation to support the OpenSHMEM collective operations. We describe the algorithms incorporated in the implementation of each OpenSHMEM collective routine and evaluate them on a Cray XC30 system. For long messages, SHCOLL shows an improvement of up to a factor of 12 over the vendor's implementation. We also discuss future development of the library, as well as how it will be incorporated into the OSSS OpenSHMEM reference implementation.
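The layering described above (collectives built purely from OpenSHMEM point-to-point puts, gets, atomics, and synchronization routines) can be illustrated with a minimal sketch of a linear broadcast. This is not SHCOLL's actual code: the function name linear_broadcast and the flag-based completion protocol are illustrative assumptions, and psync is assumed to be a symmetric long initialized to zero on every PE before the call.

#include <shmem.h>
#include <string.h>

/* Illustrative sketch only: a linear (root-pushes-to-all) broadcast built
 * from standard OpenSHMEM point-to-point primitives.  'psync' must be a
 * symmetric long initialized to 0 on all PEs before the call. */
static void linear_broadcast(void *dest, const void *source, size_t nbytes,
                             int root, long *psync)
{
    const int me   = shmem_my_pe();
    const int npes = shmem_n_pes();

    if (me == root) {
        for (int pe = 0; pe < npes; pe++) {     /* push payload to every PE */
            if (pe != root)
                shmem_putmem(dest, source, nbytes, pe);
        }
        shmem_fence();                          /* order data before flags  */
        for (int pe = 0; pe < npes; pe++) {     /* signal completion        */
            if (pe != root)
                shmem_long_p(psync, 1, pe);
        }
        shmem_quiet();                          /* wait for remote delivery */
        memcpy(dest, source, nbytes);           /* local copy on the root   */
    } else {
        shmem_long_wait_until(psync, SHMEM_CMP_EQ, 1);  /* wait for signal  */
        *psync = 0;                             /* reset flag for reuse     */
    }
}

A production library would additionally need to handle flag reuse across consecutive calls (for example, by alternating synchronization slots or inserting barriers) and would replace the naive linear push with more scalable schedules; those details are omitted here for brevity.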



Acknowledgments

This research was funded in part by the United States Department of Defense, and was supported by resources at Los Alamos National Laboratory. This publication has been approved for public, unlimited distribution by Los Alamos National Laboratory, with document number LA-UR-18-27273.

This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231.

Author information

Corresponding author

Correspondence to Srđan Milaković.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Milaković, S., Budimlić, Z., Pritchard, H., Curtis, A., Chapman, B., Sarkar, V. (2019). SHCOLL - A Standalone Implementation of OpenSHMEM-Style Collectives API. In: Pophale, S., Imam, N., Aderholdt, F., Gorentla Venkata, M. (eds) OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity. OpenSHMEM 2018. Lecture Notes in Computer Science, vol 11283. Springer, Cham. https://doi.org/10.1007/978-3-030-04918-8_6

  • DOI: https://doi.org/10.1007/978-3-030-04918-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04917-1

  • Online ISBN: 978-3-030-04918-8

  • eBook Packages: Computer Science (R0)
